Daily Research Briefing 2026-04-25

📅 2026-04-25 📂 Research Briefing ⏱️ 7 min read

📅 Generated: 2026-04-25 22:50 (Asia/Shanghai) | Sources: arXiv · GitHub · HackerNews · tech media · big-tech blogs

📄 I. Latest arXiv Papers

1. Seeing Fast and Slow: Learning the Flow of Time in Videos

Area: arXiv/CV
Abstract: How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and controlling the passage of time. In this paper, we study time as a learnable visual concept and develop models for reasoning about and manipulating the flow of time in …
Why recommended: Computer vision keeps advancing at the perception layer, and pairing it with deep learning has brought substantial accuracy gains.
Link: https://arxiv.org/abs/2604.21931v1

2. Seeing Without Eyes: 4D Human-Scene Understanding from Wearable IMUs

Area: arXiv/CV
Abstract: Understanding human activities and their surrounding environments typically relies on visual perception, yet cameras pose persistent challenges in privacy, safety, energy efficiency, and scalability. We explore an alternative: 4D perception without vision. Its goal is to reconstruct human motion and 3D scene layouts purely from everyday wearable sensors. For this we introduce IMU-to-4D, a framewor…
Why recommended: Reconstructing human motion and 3D scene layout from wearable sensors alone sidesteps the privacy and energy costs of cameras.
Link: https://arxiv.org/abs/2604.21926v1

3. Temporal Taskification in Streaming Continual Learning: A Source of Evaluation Instability

Area: arXiv/LG
Abstract: Streaming Continual Learning (CL) typically converts a continuous stream into a sequence of discrete tasks through temporal partitioning. We argue that this temporal taskification step is not a neutral preprocessing choice, but a structural component of evaluation: different valid splits of the same stream can induce different CL regimes and therefore different benchmark conclusions. To study this…
Why recommended: Shows that the way a stream is split into tasks can change continual-learning benchmark conclusions, which matters to anyone comparing CL methods.
Link: https://arxiv.org/abs/2604.21930v1

4. Fine-Tuning Regimes Define Distinct Continual Learning Problems

Area: arXiv/LG
Abstract: Continual learning (CL) studies how models acquire tasks sequentially while retaining previously learned knowledge. Despite substantial progress in benchmarking CL methods, comparative evaluations typically keep the fine-tuning regime fixed. In this paper, we argue that the fine-tuning regime, defined by the trainable parameter subspace, is itself a key evaluation variable. We formalize adaptation…
Why recommended: Argues that the fine-tuning regime is itself an evaluation variable, which bears on how continual-learning comparisons should be read.
Link: https://arxiv.org/abs/2604.21927v1

5. When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs

Area: arXiv/AI
Abstract: Despite impressive progress in capabilities of large vision-language models (LVLMs), these systems remain vulnerable to hallucinations, i.e., outputs that are not grounded in the visual input. Prior work has attributed hallucinations in LVLMs to factors such as limitations of the vision backbone or the dominance of the language component, yet the relative importance of these factors remains unclea…
Why recommended: A typical example of a recent technical trend, and fairly representative.
Link: https://arxiv.org/abs/2604.21911v1

6. From Research Question to Scientific Workflow: Leveraging Agentic AI for Science Automation

Area: arXiv/AI
Abstract: Scientific workflow systems automate execution – scheduling, fault tolerance, resource management – but not the semantic translation that precedes it. Scientists still manually convert research questions into workflow specifications, a task requiring both domain knowledge and infrastructure expertise. We propose an agentic architecture that closes this gap through three layers: an LLM interprets…
Why recommended: Web and coding agents are being deployed faster, with clear value for engineering practice.
Link: https://arxiv.org/abs/2604.21910v1

7. Evaluation of Automatic Speech Recognition Using Generative Large Language Models

Area: arXiv/CL
Abstract: Automatic Speech Recognition (ASR) is traditionally evaluated using Word Error Rate (WER), a metric that is insensitive to meaning. Embedding-based semantic metrics are better correlated with human perception, but decoder-based Large Language Models (LLMs) remain underexplored for this task. This paper evaluates their relevance through three approaches: (1) selecting the best hypothesis between tw…
Why recommended: WER ignores meaning; this paper tests whether decoder-based LLMs can serve as semantic evaluators of ASR output.
Link: https://arxiv.org/abs/2604.21928v1

8. MathDuels: Evaluating LLMs as Problem Posers and Solvers

Area: arXiv/CL
Abstract: As frontier language models attain near-ceiling performance on static mathematical benchmarks, existing evaluations are increasingly unable to differentiate model capabilities, largely because they cast models solely as solvers of fixed problem sets. We introduce MathDuels, a self-play benchmark in which models occupy dual roles: each authors math problems under adversarial prompting and solves pr…
Why recommended: Static math benchmarks are saturating; a self-play setup in which models both pose and solve problems offers a way to keep telling them apart.
Link: https://arxiv.org/abs/2604.21916v1

🌟 II. Trending GitHub Projects

1. kyegomez/OpenMythos

Stars: ⭐ 10,274 · Python
Description: A theoretical reconstruction of the Claude Mythos architecture, built from first principles using the available research literature.
Why recommended: Community feedback suggests it has some value; worth a quick look.
Link: https://github.com/kyegomez/OpenMythos

2. alchaincyf/huashu-design

Stars: ⭐ 6,366 · HTML
Description: Huashu Design · HTML-native design skill for Claude Code · high-fidelity prototypes / slides / animation + 20 design philosophies + 5-dimension review + MP4 export · Agent-agnostic
Why recommended: Agent skills for design work are a visible trend; this one targets high-fidelity prototypes, slides, and animation.
Link: https://github.com/alchaincyf/huashu-design

3. ConardLi/garden-skills

Stars: ⭐ 1,182 · CSS
Description: An AI agent skill that transforms AI-generated web pages from “functional” to “stunning.” (Inspired by Claude Design)
Why recommended: Another agent skill aimed at the visual quality of AI-generated web pages, a theme that recurs across this list.
Link: https://github.com/ConardLi/garden-skills

4. leigest519/OpenGame

Stars: ⭐ 1,057 · TypeScript
Description: OpenGame: Open Agentic Coding for Games
Why recommended: Applies agentic coding to game development.
Link: https://github.com/leigest519/OpenGame

5. cosmicstack-labs/mercury-agent

Stars: ⭐ 891 · TypeScript
Description: Soul-driven AI agent with permission-hardened tools, token budgets, and multi-channel access. Runs 24/7 from CLI or Telegram.
Why recommended: Agents are the central direction in AI engineering today, marking the shift of large models from “conversation” to “execution”.
Link: https://github.com/cosmicstack-labs/mercury-agent

6. ZeroZ-lab/cc-design

Stars: ⭐ 630 · JavaScript
Description: High-fidelity HTML design and prototype guidance skill for AI agents
Why recommended: A typical example of a recent technical trend, and fairly representative.
Link: https://github.com/ZeroZ-lab/cc-design

7. GammaLabTechnologies/harmonist

Stars: ⭐ 521 · Python
Description: Portable AI agent orchestration with mechanical protocol enforcement. 186 agents, zero runtime dependencies.
Why recommended: Agents are the central direction in AI engineering today, marking the shift of large models from “conversation” to “execution”.
Link: https://github.com/GammaLabTechnologies/harmonist

8. TheRealSeanDonahoe/agents-md

Stars: ⭐ 510
Description: Drop-in AGENTS.md that makes every coding agent behave like a senior engineer instead of an eager intern. Kills sycophancy, stops drive-by refactors, forces verification loops. Synthesizes Karpathy’s…
Why recommended: A drop-in AGENTS.md aimed at curbing sycophancy and unrequested refactors in coding agents.
Link: https://github.com/TheRealSeanDonahoe/agents-md

📰 III. AI Tech Media & Big-Tech Blogs

1. Meta’s loss is Thinking Machines’ gain

Source: TechCrunch AI · AI media
Summary: Meta has been poaching talent from Thinking Machines Lab. But it’s a two-way street.
Why recommended: Offers some fresh ideas and perspectives that broaden one's view.
Link: https://techcrunch.com/2026/04/24/metas-loss-is-thinking-machines-gain/

2. ComfyUI hits $500M valuation as creators seek more control over AI-generated media

Source: TechCrunch AI · AI media
Summary: ComfyUI, whose tools give creators more control over AI image, video, and audio generation, just raised $30 million.
Why recommended: Signals investor interest in tools that give creators finer control over AI image, video, and audio generation.
Link: https://techcrunch.com/2026/04/24/comfyui-hits-500m-valuation-as-creators-seek-more-control-over-ai-generated-media/

3. Google to invest up to $40B in Anthropic in cash and compute

Source: TechCrunch AI · AI media
Summary: Google plans up to $40B investment in Anthropic as AI rivals race to secure massive compute capacity, following the limited release of its powerful, cybersecurity-focused Mythos model.
Why recommended: A very large cash-and-compute commitment that shows how hard AI labs are racing to secure capacity.
Link: https://techcrunch.com/2026/04/24/google-to-invest-up-to-40b-in-anthropic-in-cash-and-compute/

4. Discord Sleuths Gained Unauthorized Access to Anthropic’s Mythos

Source: Wired AI · AI media
Summary: Plus: Spy firms tap into a global telecom weakness to track targets, 500,000 UK health records go up for sale on Alibaba, Apple patches a revealing notification bug, and more.
Why recommended: Sparked considerable discussion; worth knowing the industry's views.
Link: https://www.wired.com/story/security-news-this-week-discord-sleuths-gained-unauthorized-access-to-anthropics-mythos/

5. Ace the Ping-Pong Robot Can Whup Your Ass

Source: Wired AI · AI media
Summary: Ace can read the trajectory of a ball, adjust the racket angle, and respond with strokes that keep the exchange alive with real players.
Why recommended: One of the current hot areas; no harm in getting acquainted.
Link: https://www.wired.com/story/ace-the-robot-wants-to-become-the-world-table-tennis-champion/

6. AI-Designed Drugs by a DeepMind Spinoff Are Headed to Human Trials

Source: Wired AI · AI media
Summary: Isomorphic Labs president Max Jaderberg said at WIRED Health in London that the startup has built a “broad and exciting pipeline of new medicines.”
Why recommended: Community feedback suggests it has some value; worth a quick look.
Link: https://www.wired.com/story/wired-health-2026-how-ai-is-powering-drug-discovery-max-jaderberg/

7. Why are top university websites serving porn? It comes down to shoddy housekeeping.

Source: Ars Technica · AI media
Summary: Hundreds of subdomains from dozens of universities have been hijacked by scammers.
Why recommended: Community feedback suggests it has some value; worth a quick look.
Link: https://arstechnica.com/security/2026/04/why-are-top-university-websites-serving-porn-it-comes-down-to-shoddy-housekeeping/

8. In a first, a ransomware family is confirmed to be quantum-safe

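Source: Ars Technica · AI media
Summary: Technically speaking, there’s no practical benefit to use PQC. So why is it being used?
Why recommended: The first confirmed ransomware family to use post-quantum cryptography, and a look at why it does so when there is no practical benefit.
Link: https://arstechnica.com/security/2026/04/now-even-ransomware-is-using-post-quantum-cryptography/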

9. Microsoft issues emergency update for macOS and Linux ASP.NET threat

Source: Ars Technica · AI media
Summary: When authentication fails, things can go very, very wrong.
Why recommended: Offers some fresh ideas and perspectives that broaden one's view.
Link: https://arstechnica.com/security/2026/04/microsoft-issues-emergency-update-for-macos-and-linux-asp-net-threat/

10. Three reasons why DeepSeek’s new model matters

Source: MIT Tech Review · AI media
Summary: On Friday, Chinese AI firm DeepSeek released a preview of V4, its long-awaited new flagship model. Notably, the model can process much longer prompts than its last generation, thanks to a new design t…
Why recommended: The topic is drawing heavy discussion on HN and reflects what the developer community is actually paying attention to.
Link: https://www.technologyreview.com/2026/04/24/1136422/why-deepseeks-v4-matters/

11. The Download: supercharged scams and studying AI healthcare

Source: MIT Tech Review · AI media
Summary: This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. We’re in a new era of AI-driven scams When ChatGPT was release…
Why recommended: A newsletter roundup covering AI-driven scams and AI in health care.
Link: https://www.technologyreview.com/2026/04/24/1136400/the-download-supercharged-scams-questionable-ai-healthcare/

12. Health-care AI is here. We don’t know if it actually helps patients.

Source: MIT Tech Review · AI media
Summary: I don’t need to tell you that AI is everywhere. Or that it is being used, increasingly, in hospitals. Doctors are using AI to help them with notetaking. AI-based tools are trawling through patient rec…
Why recommended: Asks whether the AI tools now used in hospitals actually help patients.
Link: https://www.technologyreview.com/2026/04/24/1136352/health-care-ai-dont-know-actually-helps-patients/

🔥 IV. HackerNews: Top Stories, Past 48h

1. DeepSeek v4

Score: 1987 points · 💬 1516 comments
Why recommended: A top HN story with very high engagement; strongly recommended.
Link: https://api-docs.deepseek.com/news/news260424
HN discussion: https://news.ycombinator.com/item?id=47884971
Top comments:
· hodgehog11: There are quite a few comments here about benchmark and coding performance. I would like to offer some opinions regarding its capacity for mathematics problems in an active research setting. I have a collection of novel probability and statistics problems at the masters and PhD level with varying…
· throwa356262: Seriously, why can’t huge companies like OpenAI and Google produce documentation that is half this good?? https://api-docs.deepseek.com/guides/thinking_mode No BS, just a concise description of exactly what I need to write my own agent.
· orbital-decay: >we implement end-to-end, bitwise batch-invariant, and deterministic kernels with minimal performance overhead Pretty cool, I think they’re the first to guarantee determinism with the fixed seed or at the temperature 0. Google came close but never guaranteed it AFAIK. DeepSeek show their roots - it…

2. GPT-5.5

Score: 1552 points · 💬 1032 comments
Why recommended: A top HN story with very high engagement; strongly recommended.
Link: https://openai.com/index/introducing-gpt-5-5/
HN discussion: https://news.ycombinator.com/item?id=47879092
Top comments:
· tedsanders: Just as a heads up, even though GPT-5.5 is releasing today, the rollout in ChatGPT and Codex will be gradual over many hours so that we can make sure service remains stable for everyone (same as our previous launches). You may not see it right away, and if you don’t, try again later in the day. We…
· simonw: This doesn’t have API access yet, but OpenAI seem to approve of the Codex API backdoor used by OpenClaw these days… https://twitter.com/steipete/status/2046775849769148838 and https://twitter.com/romainhuet/status/2038699202834841962 And that backdoor API has GPT-5.5. So here’s a pelican:…
· jfkimmes: Everyone talked about the marketing stunt that was Anthropic’s gated Mythos model with an 83% result on CyberGym. OpenAI just dropped GPT 5.5, which scores 82% and is open for anybody to use. I recommend anybody in offensive/defensive cybersecurity to experiment with this. This is the real data…

3. An update on recent Claude Code quality reports

Score: 925 points · 💬 710 comments
Why recommended: A top HN story with very high engagement; strongly recommended.
Link: https://www.anthropic.com/engineering/april-23-postmortem
HN discussion: https://news.ycombinator.com/item?id=47878905
Top comments:
· dgreensp: This reveals a staggering level of incompetence, if that’s really all it is, and lack of transparency. They don’t have ANY product-level quality tests that picked this up? Many users did their own tests and published them. It’s not hard. And these users’ complaints were initially dismissed. I don’t…
· 6keZbCECT2uB: “On March 26, we shipped a change to clear Claude’s older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem…
· cmenge: Bit surprised about the amount of flak they’re getting here. I found the article seemed clear, honest and definitely plausible. The deterioration was real and annoying, and shines a light on the problematic lack of transparency of what exactly is going on behind the scenes and the somewhat…

4. I cancelled Claude: Token issues, declining quality, and poor support

Score: 905 points · 💬 536 comments
Why recommended: A top HN story with very high engagement; strongly recommended.
Link: https://nickyreinert.de/en/2026/2026-04-24-claude-critics/
HN discussion: https://news.ycombinator.com/item?id=47892019

5. Meta tells staff it will cut 10% of jobs

Score: 791 points · 💬 859 comments
Why recommended: A top HN story with very high engagement; strongly recommended.
Link: https://www.bloomberg.com/news/articles/2026-04-23/meta-tells-staff-it-will-cut-10-of-jobs-in-push-for-efficiency
HN discussion: https://news.ycombinator.com/item?id=47879986

6. Google plans to invest up to $40B in Anthropic

Score: 683 points · 💬 672 comments
Why recommended: A top HN story with very high engagement; strongly recommended.
Link: https://www.bloomberg.com/news/articles/2026-04-24/google-plans-to-invest-up-to-40-billion-in-anthropic
HN discussion: https://news.ycombinator.com/item?id=47892074

7. US special forces soldier arrested after allegedly winning $400k on Maduro raid

Score: 673 points · 💬 726 comments
Why recommended: A top HN story with very high engagement; strongly recommended.
Link: https://www.cnn.com/2026/04/23/politics/us-special-forces-soldier-arrested-maduro-raid-trade
HN discussion: https://news.ycombinator.com/item?id=47882645

8. Sabotaging projects by overthinking, scope creep, and structural diffing

Score: 476 points · 💬 115 comments
Why recommended: Lively HN discussion (115 comments) and strong community interest.
Link: https://kevinlynagh.com/newsletter/2026_04_overthinking/
HN discussion: https://news.ycombinator.com/item?id=47890799

📚 Recommended Deep Reads

Type	Title	Area	Notes	Link
📄 Paper	Seeing Fast and Slow: Learning the Flo…	CV		arXiv
📄 Paper	Seeing Without Eyes: 4D Human-Scene Un…	CV		arXiv
📄 Paper	Temporal Taskification in Streaming Co…	LG		arXiv
📄 Paper	Fine-Tuning Regimes Define Distinct Co…	LG		arXiv
📄 Paper	When Prompts Override Vision: Prompt-I…	AI		arXiv
🌟 Project	kyegomez/OpenMythos	GitHub	Python	GitHub
🌟 Project	alchaincyf/huashu-design	GitHub	HTML	GitHub
🌟 Project	ConardLi/garden-skills	GitHub	CSS	GitHub
🔥 Hot post	DeepSeek v4	HN	1987 pts	HN
🔥 Hot post	GPT-5.5	HN	1552 pts	HN
🔥 Hot post	An update on recent Claude Code qualit…	HN	925 pts	HN
🔥 Hot post	I cancelled Claude: Token issues, decl…	HN	905 pts	HN
🔥 Hot post	Meta tells staff it will cut 10% of jo…	HN	791 pts	HN
