Extended Reality (XR) is an umbrella term that encompasses Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR), providing users with immersive and interactive digital experiences. XR technologies have revolutionized the way we interact with the physical and digital worlds.

Generative AI (Gen AI) refers to a class of Artificial Intelligence techniques that generate different types of content, including text, images, audio, video, and 3D models. Gen AI models learn patterns and underlying structures from existing data and use that knowledge to generate new content.

The powerful synergy between Gen AI and XR holds tremendous potential for innovation with a transformative impact on immersive experiences. By leveraging Gen AI to develop high quality XR content, we can unlock new frontiers of innovation, enabling realistic, adaptive, and interactive experiences that push the boundaries of human imagination.

Generative AI for XR productivity

Different Gen AI models help generate XR content across multiple modalities – text, image, audio, video, and 3D. We use most of them in XRGen – our Gen AI-enabled Intelligent XR Solution accelerator.

  • Text generation models: Models such as the GPT family generate text from natural language prompts. Language models like BERT and RoBERTa can be fine-tuned for specific language-understanding tasks such as sentiment analysis and text classification. In addition, text embedding models can be used to retrieve contextually relevant text from different data sources.
  • Image generation models: Deep learning text-to-image models generate detailed, photo-realistic images from text descriptions. Examples include OpenAI’s DALL-E 2, Stability AI’s Stable Diffusion, and Midjourney, while ControlNet adds structural conditioning (such as edges or depth maps) to diffusion models like Stable Diffusion.
  • Audio generation models: These machine learning models take a natural language description as input and produce audio matching that description. Text-to-speech systems such as ElevenLabs, the VALL-E neural codec language model, and NaturalSpeech 2 enable controllable speech generation and voice cloning.
  • Video generation models: These machine learning models take text prompts, images, or video as input and produce video accordingly. Examples include Runway’s Gen-1 and Gen-2.
  • Generative AI-powered tools for code generation: GitHub Copilot, Amazon CodeWhisperer, and Amazon CodeGuru support code generation and code review, which speeds up building XR applications. Developers working with XR tools and platforms like Blender and Unity can use these Gen AI tools to generate code snippets for their projects.
  • NeRF (Neural Radiance Field): One of the key challenges in XR content creation is building realistic, immersive environments and 3D content. Neural Radiance Fields (NeRFs) help address this by generating photo-realistic 3D environments and content. Because a NeRF models how light travels along camera rays through a scene, it can capture reflections and transparencies that are difficult to reproduce with conventional 3D reconstruction. Given captured images as input, the system can automatically create a 3D scene from multiple photos of the same scene. We will cover this model in detail in our NeRF-specific blog.
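
To make the NeRF idea more concrete, here is a minimal sketch of the volume-rendering step that turns samples along one camera ray into a pixel color. In a real NeRF, a trained neural network predicts the density and color at each sample point; in this sketch those values are hard-coded toy numbers, and the function name `composite_ray` is our own illustrative choice rather than part of any library.

```python
import math

def composite_ray(densities, colors, deltas):
    """Blend samples along one camera ray into a single RGB color.

    densities: volume density (sigma) at each sample, front to back
    colors:    RGB color at each sample
    deltas:    distance between consecutive samples
    """
    transmittance = 1.0          # fraction of light that has not been absorbed yet
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        # Opacity contributed by this segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for c in range(3):
            pixel[c] += weight * rgb[c]
        # Light that survives this segment continues to the next sample.
        transmittance *= 1.0 - alpha
    return pixel, transmittance

# Toy ray (assumed values): empty space, then a dense red surface,
# then a blue sample hidden behind it.
densities = [0.0, 50.0, 50.0]
colors = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
deltas = [0.5, 0.5, 0.5]

pixel, remaining = composite_ray(densities, colors, deltas)
print(pixel, remaining)
```

With these toy values the pixel comes out almost purely red: the empty sample contributes nothing, and the dense red surface absorbs nearly all the light before it reaches the blue sample behind it. Rendering a full image repeats this blend for one ray per pixel.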

The recent announcement from Unity of Generative AI-enabled products has opened up new avenues for creativity and innovation.

  • Unity Muse: An expansive platform for AI-driven assistance during creation. It enables creators to develop real-time 3D experiences with simple text-based prompts.
  • Unity Sentis: Allows neural networks to be embedded in your builds, enabling real-time experiences that were previously out of reach. It runs AI models in the Unity Runtime, enhancing the app on user devices.

Applications of Generative AI in XR

Generative AI has emerged as a game-changing technology across industries, and combined with XR, it further expands the possibilities for building compelling, immersive experiences.

At Persistent, we are using Generative AI for XR applications in the following ways:

  • Content Creation
    As seen above, Gen AI models can help create several kinds of content:
    • Textures and Materials: They can create realistic and dynamic textures and materials for 3D objects, enhancing the visual quality of the overall XR environment.
    • Objects: They can create virtual objects, props, and designs for XR applications. This accelerates content creation and expands the range of available assets.
    • Environment creation: They can create realistic, high-fidelity environments for XR applications. These environments can be generated dynamically, allowing virtually limitless variations and more compelling immersive experiences.
  • Contextual Personalization and Customization
    Gen AI allows us to customize XR applications in different ways:
    • Environment customization: It empowers both XR designers and users to personalize XR environments by generating unique layouts and designs. Some Gen AI models also let creators guide the generation from a baseline design.
    • Avatars: Gen AI can help create personalized avatars that resemble the user and adapt to their facial features, expressions, and body movements. This helps identify users in XR worlds and enhances immersion in XR environments.
  • Building different Interactions
    Gen AI enables simulation of both object interactions and user interactions, enhancing the sense of immersion and presence. Examples include:
    • Human-like communication: Gen AI models can make conversations between users and bot assistants within XR experiences feel more human. OpenAI’s GPT models, for example, can understand natural language and generate natural-sounding responses.
    • Object simulations: Generative AI models can help simulate physics-based interactions, enabling virtual objects to respond realistically to user actions. This adds authenticity and makes XR experiences more engaging and immersive.

XRGen – Persistent’s Generative AI-enabled Intelligent XR Solution accelerator

We have built an XR content creation pipeline that leverages the power of Gen AI models to accelerate the XR experience-building process.

With the help of Gen AI models, we have streamlined XR experience building for each type of content in AR, MR, and VR applications:

  • Text & Interactions: We use Microsoft Azure OpenAI’s embedding model (“text-embedding-ada-002”) and GPT models (“gpt-3.5-turbo” and “GPT-4”) to retrieve contextual information for the XR experiences. We also use the GPT models to enable conversational interactions, personalization, and customization in the applications.
  • Images & Textures: We use Stable Diffusion with ControlNet to generate contextual textures for 3D models and other images used in the XR environments.
  • 3D models: We use NeRF-based models to generate 3D models for various XR experiences.
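
As an illustration of the retrieval step in the Text & Interactions item, the sketch below ranks stored passages by cosine similarity to a query embedding. It is a simplified sketch, not the XRGen implementation: the three-dimensional vectors, the sample passages, and the `retrieve` helper are all hypothetical. In practice the vectors would come from an embedding model such as “text-embedding-ada-002”, and real embeddings have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, documents, top_k=1):
    """Return the text of the top_k documents most similar to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, doc["embedding"]),
        reverse=True,
    )
    return [doc["text"] for doc in ranked[:top_k]]

# Hypothetical knowledge base with toy 3-dimensional embeddings.
documents = [
    {"text": "Pump maintenance checklist", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Router troubleshooting guide", "embedding": [0.1, 0.9, 0.1]},
    {"text": "Safety goggles policy", "embedding": [0.0, 0.2, 0.9]},
]

# Stands in for the embedding of a user question such as "my router is offline".
query_vec = [0.2, 0.8, 0.0]

best = retrieve(query_vec, documents)
print(best)
```

The retrieved passage can then be supplied to a GPT model as context, so that the assistant inside the XR experience answers from the relevant equipment or workflow documentation.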

Content creation is contextual, based on the requirements of the use case and the type of content. This expedites XR development, reducing production time and costs.

Currently, this pipeline focuses on creating content for XR applications across multiple domains, including Healthcare, Telecommunication, and Manufacturing. Examples include:

  • Holographic MR experiences for operations, troubleshooting and maintenance of different equipment, devices and complex workflows
  • AR experiences for product experience and field service management
  • AR navigation
  • VR experiences for simulations and collaboration

With our expertise in Generative AI, XR technologies, and building AR, VR, and MR solutions, we help organizations build customized, intuitive, and immersive experiences. We provide consulting, design, development, and support services to clients building AR, MR, and VR solutions.

Given their relevance to the physical world and real-time experiences, XR applications are becoming more widely adopted across various industries. We predict that the use of Generative AI in XR will have a significant impact on immersive experiences, while also reducing project costs and improving efficiency and productivity.

If you want to explore more about our Generative AI in XR offerings, please reach out to us.