Successful language model evals — Jason Wei Everybody uses evaluation benchmarks (“evals”), but I think they deserve more attention than they are currently getting. Evals are incentives for the research community, and breakthroughs are often closely linked to a huge performance jump on some eval. I...
On Template-Based Feed Generation A couple of days ago, I was demoralized by a minor Jekyll issue. I was thinking about generating a single Atom feed from two content types. Turns out the plugin provided by GitHub Pages can’t do this directly.
A framework for thinking about team memory, joining up and serendipity in hybrid organisations Last week, I gave a keynote talk at Agile Manchester, based on a previous blog post. The talk was more detailed and had a new framing; this post summarises what I shared.
Bananagrams is NP-complete Have you ever wondered if Bananagrams is NP-complete? No? Well I think it is, and I’ll prove it!
Solodevs and the trap of the game engine Since I was 19 I’ve wanted to make my own video games, as in creating everything in the game myself, often referred to as being a solo developer, or solodev. But I was bad at committing to anything and I always thought that I lacked some specific skill su...
Thinking Big and Small Writing about the big beautiful mess that is making things for the world wide web.
New MacBook Setup This is mostly a reference for myself. I had to migrate to a new MacBook amidst a very busy week and I realized I have already memorized most of these steps, but this time I took notes. It took me about an hour. Next time will be even faster.
Creating a Video Game Wedding Invite with Adafruit EdgeBadge and PyBadge LC Nineteen years ago, my spouse-to-be and I met and became friends. During the pandemic, we got engaged and wanted to create a unique and memorable wedding invitation for our friends and family. Due …
Replacing pyinstaller with 100 lines of code A tale of how I accidentally stumbled upon some interesting tech over time.
:epic-handshake: Reorg half a seat to the left Handoffs suck. They’re the worst part of building products. But many organizations seem designed to maximize the number of handoffs on a project.
Supply Chain Data Maturity Spreadsheets Working with data in supply chain can be difficult. Many companies have system silos that fail to utilize cloud data warehouse services like Sno...
A Simple QR Based Food Ordering App Why: During my stay at a hostel in Bangalore, I noticed the inconvenience of the existing food ordering system in the hostel. To order food, we had to call the cook and place our orders verbally or find the cook and place the order directly if you are new ...
What is a collision? From Mario bouncing off a Goomba to two cars bumping into each other in a racing game, dealing with collisions is such an integral part of most video games that we often take it for granted.
Setting the contents of a Windows Runtime Vector from C++/WinRT in one call - The Old New Thing The one-stop shop for updating a Windows Runtime Vector.
Enhancing Enum Handling in Spargine: Beyond Enums and into Versatility This content explains the usage of Enums in programming, cautioning against relying solely on Enum values for human-readable names from databases due to potential performance issues. It introduces …
Clean Architecture Sucks A brief conversation about the Clean Architecture approach and why some teams struggle with it.
Data Fetching Patterns in Single-Page Applications Five patterns to help Single Page Applications fetch data from remote sources
Developing cloud native apps with Aspire - Visual Studio Blog Introducing the general availability of .NET Aspire, a comprehensive stack aimed at simplifying the way .NET cloud-native apps are built and managed.
Attack Techniques: Full-Trust Script Downloads While it’s common to think of cyberattacks as being conducted by teams of elite cybercriminals leveraging the freshest 0-day attacks against victims’ PCs, the reality is far more mundan…
Attack Techniques: Remote Control Software In yesterday’s post, I outlined the two most successful (and stupid simple) attack techniques that you might not expect to work (and you’d be so very wrong): “Please give me your …
Entity Framework Core 8 provider for Firebird is ready (23 May 2024, 1 min read). Tags: .NET, C#, Databases in general, Entity Framework Core, Firebird, LINQ, SQL
We’re Ending Our Samsung Collaboration | iFixit News It’s not us, it’s you. It’s with a heavy wrench that we have decided to end our partnership with Samsung. Despite a huge amount of effort, Samsung’s approach to repairability does not align with our mission.
A Grand Unified Theory of the AI Hype Cycle I’m sorry, but as an AI language model, I cannot repeat history exactly. However, I can rhyme with it.