OpenAI launches GPT-4o image generation with improved text rendering and instruction following

Launched about a year ago, OpenAI’s GPT-4o has been refined and improved with new features. The latest is Image Generation: the AI model can generate high-quality, detailed images and can follow your natural language instructions to modify them until you get just the image you were picturing in your head.

You know how older AI models struggled with text: if you asked them to generate a sign, at best you got a sign with gibberish words, and at worst you got squiggles that aren’t even letters. But check this out:

GPT-4o can create images with perfectly legible text

Image generation usually starts with entering a text prompt, and you then refine the image by refining the original prompt. GPT-4o works differently: you ask it for an image, then tell it what to change, then ask it to change more things and so on until you get your result. Here are some examples:

Generating and editing an image through plain English

You can follow the Source link below to examine the prompts that created these images. Note that OpenAI did some cherry picking: a lot of the images are “best of 2” or even “best of 8”, so the model needed a few tries to get it right. Still, the results look quite impressive and the UI is as simple as it gets.

Here is another example. GPT-4o can start from scratch or it can modify an image you give it. Here, the user gives it a photo of a cat and asks the AI to give it a detective hat and monocle. Then the user proceeds to refine the image, turning it into something that could be a screenshot from an RPG.

Prototyping a cat detective RPG

You can start with multiple images too and combine elements from each image into the final result. OpenAI says that GPT-4o is good at following detailed instructions: it can handle 10-20 different objects in a scene without getting tripped up (other AI models can only handle 5-8 objects, says the company).

GPT-4o is not perfect and OpenAI is the first to admit it. Sometimes it crops images off at the bottom, hallucinations are still an issue, working with more than 10-20 objects can be tricky, and rendering text with non-Latin characters needs work too, among other limitations.

Examples of GPT-4o getting it wrong

Finally, here are some video demonstrations showing off GPT-4o’s new image generation skills:

Source
