/u/Efistoffeles – Jay van Zyl @ ecosystem.Ai

Re-evaluating MedQA: Why Current Benchmarks Overstate AI Diagnostic Skills

/u/Efistoffeles May 12, 2025

I recently researched and evaluated top LLMs on the MedQA dataset (Vals.ai, 09 May 2025). Normally, each test item is a multiple-choice question with five answer choices (A–E). The results show the following: o1 96.5%, o3 96.1%, o4 Mini 9…

A prompt checker for enhancing prompts that I created with Claude in 12 hours.

/u/Efistoffeles March 17, 2025

submitted by /u/Efistoffeles [link] [comments]

AI Users! This is how to export all your AI Chats to store them in a JSON file locally!

/u/Efistoffeles May 27, 2024

Google released this video yesterday. As much as the show was a joke, this is actually amazing.

/u/Efistoffeles May 22, 2024

Microsoft announced Copilot+ and this is what it does… in Minecraft.

/u/Efistoffeles May 21, 2024

Copilot suddenly has a limit of 5 messages per chat…

/u/Efistoffeles May 18, 2024

Connecting any version of GPT straight to Gemini is now possible.

/u/Efistoffeles May 16, 2024

Freepik just released their collaboration with Magnific. It can create endless zoom-in images, and it looks crazy!

/u/Efistoffeles May 15, 2024

Gemini now has 2 million tokens

/u/Efistoffeles May 14, 2024

Claude is finally officially available in Europe!

/u/Efistoffeles May 14, 2024
