Tesla Optimus bots were controlled by humans during the ‘We, Robot’ event

During Tesla’s “We, Robot” event last week, which TechCrunch covered late into the night, sources on the ground sent me a handful of videos of the automaker’s Optimus humanoid robots walking around the party, dancing, mixing drinks, and talking to guests. Most, if not all, of those who attended the affair are Tesla investors and […]

Business Read on TechCrunch
Apple study exposes deep cracks in LLMs’ “reasoning” capabilities

For a while now, companies like OpenAI and Google have been touting advanced "reasoning" capabilities as the next big step in their latest artificial intelligence models. Now, though, a new study from six Apple engineers shows that the mathematical "reasoning" displayed by advanced large language models can be extremely brittle and unreliable in the face of seemingly trivial changes to common benchmark problems. The fragility highlighted in these new results helps support previous research suggesting that LLMs' use of probabilistic pattern matching is missing the formal understanding of underlying concepts needed for truly reliable mathematical reasoning capabilities. "Current LLMs are not capable of genuine logical reasoning," the researchers hypothesize based on these results. "Instead, they attempt to replicate the reasoning steps observed in their training data." In "GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models" (currently available as a pre-print paper), the six Apple researchers start with GSM8K's standardized set of over 8,000 grade-school level mathematical word problems, which is often used as a benchmark for modern LLMs' complex reasoning capabilities. They then take the novel approach of modifying a portion of that testing set to dynamically replace certain names and numbers with new values, so a question about Sophie getting 31 building blocks for her nephew in GSM8K could become a question about Bill getting 19 building blocks for his brother in the new GSM-Symbolic evaluation.
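
The name-and-number substitution described above can be illustrated with a small sketch. This is not the researchers' code: the template wording, the value pools, and the `make_variant` helper are all hypothetical, chosen only to show how one word problem can yield many surface variants while the correct answer is recomputed for each one.

```python
import random

# Hypothetical template; the actual GSM8K question wording is not reproduced here.
TEMPLATE = (
    "{name} buys {blocks} building blocks for {pronoun} {relative} "
    "and then buys {extra} more. How many blocks are there in total?"
)

# Assumed value pools; the paper's real pools and constraints are not given in the summary.
PEOPLE = [("Sophie", "her"), ("Bill", "his")]
RELATIVES = ["nephew", "brother", "niece", "sister"]


def make_variant(rng: random.Random) -> tuple[str, int]:
    """Fill the template with fresh names and numbers; return (question, answer)."""
    name, pronoun = rng.choice(PEOPLE)
    blocks = rng.randint(5, 40)
    extra = rng.randint(1, 10)
    question = TEMPLATE.format(
        name=name,
        blocks=blocks,
        pronoun=pronoun,
        relative=rng.choice(RELATIVES),
        extra=extra,
    )
    # The ground-truth answer is recomputed from the sampled values,
    # so every variant stays solvable by the same reasoning steps.
    return question, blocks + extra


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the variants are reproducible
    for _ in range(3):
        question, answer = make_variant(rng)
        print(question, "->", answer)
```

A model that reasons about the underlying arithmetic should score the same on every variant; a drop in accuracy across variants is the kind of brittleness the study reports.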

Politics Read on Ars Technica
When Jedi Were Totally Okay With Using Force Lightning

The gamification of Star Wars' idea of Force powers has led to some pretty weird moments over the years.

Entertainment Read on Gizmodo
Pokémon developer faces major data leak

Hackers released a collection of leaked data from Pokémon game developer Game Freak over the weekend, including personal information about employees. Game Freak — which develops the main lineup of Pokémon video games — confirmed the breach in a statement, saying (per a machine translation from Japanese) that it was the result of “unauthorized access to our servers by a third party” and dated back to August of 2024. Game Freak said the leaked personal information — which it characterizes as names and company email addresses — included around 2,600 items. As Polygon notes, however, the breach appears to include much more than employee information. Redditors and others say they’ve unearthed source code from previous games as well as unused...

Business Read on The Verge
Former Product Hunt CEO Josh Buckley is looking to raise a fourth $250M fund

Josh Buckley, the former CEO of Product Hunt, is aiming to raise a fourth $250 million fund for his venture capital firm, Buckley Ventures, according to a regulatory filing. Buckley’s ambitions for this fund are significantly lower than for his previous one. He sought to raise a $500 million third fund in February 2022, right […]

Business Read on TechCrunch
Malaysia asks Interpol to find American couple tied to Dutch model Ivana Smit’s death

The Malaysian police have issued a blue notice to Interpol, asking the policing service for the whereabouts of an American couple regarding the death of Dutch model Ivana Smit in Kuala Lu

Crime and Courts Read on NL Times
Expert witness used Copilot to make up fake damages, irking judge

A New York judge recently called out an expert witness for using Microsoft's Copilot chatbot to inaccurately estimate damages in a real estate dispute whose outcome partly depended on an accurate assessment of damages. In an order Thursday, judge Jonathan Schopf warned that, "due to the nature of the rapid evolution of artificial intelligence and its inherent reliability issues," any use of AI should be disclosed before testimony or evidence is admitted in court. Admitting that the court "has no objective understanding as to how Copilot works," Schopf suggested that the legal system could be disrupted if experts started overly relying on chatbots en masse. His warning came after an expert witness, Charles Ranson, dubiously used Copilot to cross-check calculations in a dispute over a $485,000 rental property in the Bahamas that had been included in a trust for a deceased man's son. The court was being asked to assess if the executrix and trustee (the deceased man's sister) breached her fiduciary duties by delaying the sale of the property while admittedly using it for personal vacations.

Business Read on Ars Technica
Ward Christensen, BBS inventor and architect of our online age, dies at age 78

Ward Christensen, co-inventor of the computer bulletin board system (BBS), has died at age 78 in Rolling Meadows, Illinois. He was found deceased at his home on Friday after friends requested a wellness check. Christensen, along with Randy Suess, created the first BBS in Chicago in 1978, leading to an important cultural era of digital community-building that presaged much of our online world today. In the 1980s and 1990s, BBSes introduced many home computer users to multiplayer online gaming, message boards, and online community building in an era before the Internet became widely available to people outside of science and academia. They also gave rise to the shareware gaming scene that led to companies like Epic Games today. Friends and associates remember Christensen as humble and unassuming, a quiet innovator who never sought the spotlight for his groundbreaking work. Despite creating one of the foundational technologies of the digital age, Christensen maintained a low profile throughout his life, content with his long-standing career at IBM and showing no bitterness or sense of missed opportunity as the Internet age dawned.

Science Read on Ars Technica
Silo’s new season 2 trailer teases what’s next for Juliette

Apple has released the first trailer for the second season of Silo, and it looks like the season will tell us what happens to protagonist Juliette after the jaw-dropping cliffhanger at the end of season one. The show, based on a series of books by Hugh Howey, is about a community of 10,000 people living in an underground silo that’s intended to protect them from dangerous conditions aboveground. If you’ve been meaning to see the first season and haven’t yet, you probably shouldn’t watch this new trailer; as you might have guessed, it has quite a few mysteries that are fun to experience for yourself. Silo’s second season premieres on Apple TV Plus on November 15th. The season will have 10 episodes, with a new episode each week. The season...

Entertainment Read on The Verge Tech
How Hysteria! Brings Satanic Panic’s Retro Themes Into 2024

Writer-executive producers Matthew Scott Kane and David A. Goodman on their new 1980s-set horror series, coming October 18 to Peacock.

Entertainment Read on Gizmodo