2023 has felt like a year devoted to artificial intelligence and its ever-expanding capabilities, but the era of pure text output is already losing steam. The AI scene might be dominated by giants like ChatGPT and Google Bard, but a new large language model (LLM), NExT-GPT, is here to shake things up, offering the full bounty of text, image, audio, and video output.
NExT-GPT is the brainchild of researchers from the National University of Singapore and Tsinghua University. Pitched as an ‘any-to-any’ system, NExT-GPT can accept inputs in different formats and deliver its response in whichever format is requested: video, audio, image, or text. This means that you can put in a text prompt and NExT-GPT can process that prompt into a video, or you can give it an image and have that converted to an audio output.
ChatGPT has only just announced the capability to ‘see, hear and speak’, which is similar to what NExT-GPT is offering. However, ChatGPT is going for a more mobile-friendly version of this kind of feature, and is yet to introduce video capabilities.
We’ve seen a lot of ChatGPT alternatives and rivals pop up over the past year, but NExT-GPT is one of the few LLMs we’ve seen so far that can not only match the text-based output of ChatGPT but also provide outputs beyond what OpenAI’s popular chatbot can currently do. You can head over to the GitHub page or the demo page to try it out for yourself.
So, what’s it like?
I’ve fiddled around with NExT-GPT on the demo site and I have to say I’m impressed, but not blown away. Of course, this isn’t a polished product that has had the advantages of public feedback, multiple updates, and so on, but it’s still very good.
I asked it to turn a photo of my cat Miso into an image of him as a librarian, and I was pretty happy with the result. It may not be on the same level of quality as established image generators like Midjourney or Stable Diffusion, but it was still an undeniably cute picture.
I also tested out the video and audio features, but that didn’t go quite as well as the image generation. The videos that were generated were again not awful, but did have the very obvious ‘made by AI’ look that comes with a lot of generated images and videos, with everything looking a little distorted and wonky. It was uncanny.
Overall, there’s a lot of potential for this LLM to fill the audio and video gaps left by big AI names like OpenAI and Google. I do hope that as NExT-GPT gets better and better, we’ll see higher-quality outputs and be able to make some amazing home movies out of our cats seamlessly very soon.