On Tuesday, Elon Musk’s AI company xAI announced the beta release of two new language models, Grok-2 and Grok-2 mini, available to subscribers of his social media platform X (formerly Twitter). The models are also linked to the recently released Flux image synthesis model, which allows X users to create largely uncensored photorealistic images that can be shared on the site.
“Flux, accessible through Grok, is an excellent text-to-image generator, but it is also really good at creating fake photographs of real locations and people, and sending them right to Twitter,” wrote frequent AI commentator Ethan Mollick on X. “Does anyone know if they are watermarking these in any way? It would be a good idea.”
In a report posted earlier today, The Verge noted that Grok’s image generation capabilities appear to have minimal safeguards, allowing users to create potentially controversial content. According to its testing, Grok produced images depicting political figures in compromising situations, copyrighted characters, and scenes of violence when prompted.
The Verge found that while Grok claims to have certain limitations, such as avoiding pornographic or excessively violent content, these rules seem inconsistent in practice. Unlike other major AI image generators, Grok does not appear to refuse prompts involving real people or add identifying watermarks to its outputs.
Given what people are generating so far (including images of Donald Trump and Kamala Harris kissing or giving a thumbs-up on the way to the Twin Towers in an apparent 9/11 attack), the unrestricted outputs may not last for long. But then again, Elon Musk has made a big deal out of “freedom of speech” on his platform, so perhaps the capability will remain, at least until someone files a defamation or copyright suit.
People using Grok’s image generator to shock brings up an old question in AI at this point: Should misuse of an AI image generator be the responsibility of the person who creates the prompt, the organization that created the AI model, or the platform that hosts the images? So far, there is no clear consensus, and the situation has yet to be resolved legally, although a newly proposed US law called the NO FAKES Act would presumably hold X liable for the creation of realistic image deepfakes.
With Grok-2, the GPT-4 ceiling still holds
Looking beyond images, in a release blog, xAI claims that Grok-2 and Grok-2 mini represent significant advancements in capabilities, with Grok-2 supposedly outperforming some leading competitors in recent benchmarks and what we call “vibemarks.” It’s always wise to approach these claims with a dose of skepticism, but it appears that the “GPT-4 class” of AI language models (those with similar capability to OpenAI’s model) has grown larger while the GPT-4 barrier has not yet been smashed.
“There are now five GPT-4 class models: GPT-4o, Claude 3.5, Gemini 1.5, Llama 3.1, and now Grok 2,” wrote Ethan Mollick on X. “All of the labs are saying there is room left for continued huge improvements, but we haven’t seen any models really leap above GPT-4… yet.”
xAI says it recently introduced an early version of Grok-2 to the LMSYS Chatbot Arena under the name “sus-column-r,” where it reportedly achieved a higher overall Elo score than models like Claude 3.5 Sonnet and GPT-4 Turbo. Chatbot Arena is a popular subjective vibemarking website for AI models, but it became the subject of controversy recently when people disagreed with OpenAI’s GPT-4o mini placing so highly in the rankings.
According to xAI, both new Grok models show improvements over predecessor Grok-1.5 in areas like graduate-level science knowledge, general knowledge, and math problem-solving, in benchmarks that have similarly proved controversial. The company also highlighted Grok-2’s performance on visual tasks, claiming state-of-the-art results in visual math reasoning and document-based question answering.
The models are now available to X Premium and Premium+ subscribers through an updated app interface. Unlike some of its competitors in the open weights space, xAI is not releasing the model weights for download or independent verification. This closed approach stands in stark contrast to recent moves by Meta, which recently released its Llama 3.1 405B model for anyone to download and run locally.
xAI plans to release both models through an enterprise API later this month. The company says this API will feature multi-region deployment options and security measures like mandatory multifactor authentication. Details on pricing, usage limits, and data handling policies have not yet been announced.
Photorealistic image generation aside, perhaps Grok-2’s biggest liability is its deep link to X, which gives it a tendency to pull inaccurate information from tweets. It’s sort of like having a friend who insists on checking the social media site before answering any of your questions, even when it isn’t particularly relevant.
As Mollick pointed out on X, this close link can be annoying: “I only have access to Grok 2 mini right now, and it seems like a solid model, but often seems ill-served by its RAG connection to Twitter,” he wrote. “The model is fed results from Twitter that seem irrelevant to the prompt, and then desperately tries to connect them into something coherent.”