Natural Language Processing (NLP) is one area where Large transformer-based Language Models (LLMs) have achieved remarkable progress in recent years. Also, LLMs are branching out into other fields, like robotics, audio, and medicine.
Modern approaches allow LLMs to produce visual data using specialized modules like VQ-VAE and VQ-GAN, which convert continuous visual pixels into discrete…
