ChatGPT's new image engine shows real promise, but still stumbles

OpenAI's latest image generation tool is live, and it's noticeably sharper than what came before. The new engine handles typography cleanly, pulls from the web, and reasons through complex requests. But a day of hands-on testing reveals that while the technology has matured in meaningful ways, it's not yet ready to replace human designers or photographers across the board.

ChatGPT Images 2.0 rolled out with support for multiple aspect ratios and two operating modes: a standard version available to all users, and a "thinking" mode with built-in reasoning reserved for paid subscribers. The distinction matters. Past image engines captured early enthusiasm only to hit walls that kept them out of serious business use. This release aims to clear some of those hurdles.

The tool excels at personal, sentimental work. Asked to create a memorial image of a recently deceased cat alongside two favorite toys, it produced something that felt like a real sympathy card, not a generic AI output. A photo restoration task also impressed: two wedding photos were seamlessly blended into an old-style album layout, complete with photo corners, that looked genuinely nostalgic rather than kitschy.

Creative tasks reveal the engine's flexibility. A fictional poster for a Mike Allen look-alike contest in Washington Square Park came together with visual humor intact. Trading cards made from photos of a person playing softball and a 13-year-old soccer player included extracted logos from uniforms, proper name plates, and positioning details that felt polished.

Even straightforward utility requests worked well. An infographic arguing against candy corn (a treat that is, technically, neither candy nor corn) was accurate and readable, though it failed to persuade anyone to abandon the Halloween staple. A bedroom visualization that stripped away clutter, gadgets, and scattered Legos showed real spatial reasoning, though one observer noted it mainly served as a tease of what the room could look like rather than what it ever would.

The weaknesses are real. When asked to mock up a fake newspaper using current Axios headlines, the engine first pulled old stories instead of fresh ones. A second attempt grabbed today's content but produced something that looked more like a rough mock-up than a finished publication. A mahjong cheat sheet came back accurate but aesthetically flat. The reasoning capability, meanwhile, comes with a tradeoff: images take notably longer to generate, especially in thinking mode.

The technology still has rough edges that will keep it out of professional workflows where speed and pixel-perfect execution matter. But for one-off creative projects, personal uses, and situations where approximate visual accuracy beats starting from scratch, the improvements are substantial enough to justify the hype.

Author James Rodriguez: "It's a solid step forward, but don't retire your graphic designer just yet."
