In the exhilarating landscape of today's technological advancements, we stand on the cusp of 2024 with heightened expectations, invigorated by the generative artificial intelligence (AI) surge of 2023. Anchoring this technological upheaval are advances in augmented reality (AR) and virtual reality (VR), heralding a transformation in how we interact with our world and embodying the quintessence of the Fourth Industrial Revolution. Blending the bleeding edge of computer vision, generative AI, and mixed reality, these AI-driven advances are not just altering our present realities but are also set to revolutionize our future in ways yet to be fully understood.
Metaverse 1.0: Limited Edition
Once inching forward, AI leapt into an era of explosive growth in 2023, marking a pivotal transition in its evolution. In like manner, AR and VR are extending beyond their gaming and entertainment roots into sectors like healthcare, education, and retail. Yet they face significant challenges in achieving widespread adoption: while visually impressive, many AR/VR experiences mirror the superficiality of video game graphics, lacking the depth, authenticity, and level of interaction required for more impactful applications. In healthcare, for instance, limited simulation of human anatomy restricts training effectiveness, while in education, the lack of lifelike environments curtails immersive learning potential. With only 40% of users experiencing high immersion and realism in current AR/VR applications, we face a Gordian knot in crafting an immersive experience that is both realistic and highly effective.
Untying the Knot: The Dawn of Metaverse 2.0 with AI
A central hurdle to true immersion in virtual spaces lies in the limitations of traditional computer graphics and the challenges of computer vision, particularly in producing realistic, scalable graphics and fully comprehending 3D environments from 2D data. The breakthrough, as foreseen by industry leaders, is to harness AI to transition from merely capturing reality to imaginatively creating it. Pioneering work, like that of Hao Li's group at Abu Dhabi's Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), uses AI to create high-fidelity avatars from a single photo, potentially transforming virtual interactions. This method, employing generative AI, captures and replicates realistic human features and expressions. The capacity to accurately render a person's distinct appearance and expressions in 3D from online photos signals a significant departure from traditional techniques, ushering in a new era of unparalleled realism in digital engagements.
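To make the idea concrete, here is a minimal, hypothetical sketch of a photo-to-avatar pipeline in PyTorch. It is not MBZUAI's or Pinscreen's actual model; the architecture, module names, and dimensions are illustrative assumptions. The common pattern is an image encoder that regresses coefficients of a parametric 3D head model (geometry and expression), paired with a generative decoder that produces a texture map.

```python
# Hypothetical sketch of a single-photo avatar pipeline.
# All module names and dimensions are illustrative assumptions,
# not the actual MBZUAI/Pinscreen system.
import torch
import torch.nn as nn

class PhotoToAvatar(nn.Module):
    def __init__(self, n_shape=100, n_expr=50, tex_res=64):
        super().__init__()
        # CNN encoder: one RGB photo -> a compact identity/expression code
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Heads: coefficients of a parametric head model (shape + expression)
        self.shape_head = nn.Linear(128, n_shape)
        self.expr_head = nn.Linear(128, n_expr)
        # Generative texture decoder: code -> UV texture map
        self.tex_decoder = nn.Sequential(
            nn.Linear(128, tex_res * tex_res * 3), nn.Sigmoid(),
        )
        self.tex_res = tex_res

    def forward(self, photo):
        code = self.encoder(photo)
        shape = self.shape_head(code)   # 3D geometry coefficients
        expr = self.expr_head(code)     # expression coefficients
        tex = self.tex_decoder(code).view(-1, 3, self.tex_res, self.tex_res)
        return shape, expr, tex

model = PhotoToAvatar()
photo = torch.rand(1, 3, 256, 256)      # a single input photo
shape, expr, texture = model(photo)
print(shape.shape, expr.shape, texture.shape)
```

The design choice worth noting is that predicting parameters of a 3D model, rather than pixels directly, is what lets a single 2D photo yield a consistent, animatable 3D avatar.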
The advancements in creating photorealistic digital avatars herald a new era in real-time communication and address the practical challenges of telepresence. As technology advances, we can expect a synergistic evolution of software and hardware, culminating in highly realistic, engaging experiences accessible through common devices like smartphones. Products like Microsoft's HoloLens showcase the possibilities of AR in telepresence, offering virtual face-to-face interactions and "teleportation" to different settings. Similarly, Apple's Vision Pro aims to elevate the realism of digital avatars, pushing the boundaries of true telepresence. With ongoing hardware innovations, coupled with the rapid progress of AI in generating rather than merely capturing, we are on the verge of a transformative breakthrough in delivering a photorealistic, immersive, and widely accessible virtual experience.
The Metaverse Center at MBZUAI: Leading the Charge
Guided by Hao Li, Associate Professor of Computer Vision and Director of the Metaverse Center, and Abdulmotaleb El Saddik, Professor of Computer Vision, the Metaverse Center at MBZUAI is spearheading the reinvention of digital interaction in collaboration with industry leaders. Notably, Professor Li's partnership with platforms like Netflix through Pinscreen is pioneering visual dubbing: employing advanced AI to align foreign actors' lip movements with dubbed English audio, creating an experience virtually indistinguishable from actors speaking natively in English.
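The general idea behind visual dubbing can be sketched as an audio-conditioned face generator: an audio encoder turns the dubbed speech into a per-frame code, which then conditions a network that redraws the actor's mouth region. The following simplified illustration is a hypothetical sketch of that pattern, not Pinscreen's production system; every name and dimension here is an assumption.

```python
# Hypothetical sketch of audio-driven lip sync (the general idea behind
# visual dubbing), not Pinscreen's actual model.
import torch
import torch.nn as nn

class LipSyncGenerator(nn.Module):
    def __init__(self, audio_dim=80, code_dim=128):
        super().__init__()
        # Encode a short window of audio features (e.g. a mel spectrogram)
        self.audio_enc = nn.Sequential(
            nn.Linear(audio_dim, code_dim), nn.ReLU(),
            nn.Linear(code_dim, code_dim),
        )
        # Encode the actor's original frame (identity and head pose)
        self.face_enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, code_dim, 4, 2, 1), nn.ReLU(),
        )
        # Decoder fuses the audio and face codes into a lip-synced frame
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(2 * code_dim, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Sigmoid(),
        )

    def forward(self, frame, audio):
        f = self.face_enc(frame)                        # (B, C, H, W)
        a = self.audio_enc(audio)                       # (B, C)
        a = a[:, :, None, None].expand(-1, -1, f.shape[2], f.shape[3])
        return self.decoder(torch.cat([f, a], dim=1))   # new frame

gen = LipSyncGenerator()
frame = torch.rand(1, 3, 64, 64)   # a frame of the original actor
audio = torch.rand(1, 80)          # features for the dubbed English audio
synced = gen(frame, audio)
print(synced.shape)                # torch.Size([1, 3, 64, 64])
```

Conditioning on the original frame is what preserves the actor's identity while only the speech-dependent motion is resynthesized.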
Expanding its impact beyond entertainment, MBZUAI is making strides in capturing large-scale environments and dynamic performances in real time, work of significant interest to industry giants like Google for ambitious projects such as the digital recreation of entire cities. This approach transcends traditional 3D scanning by using deep neural networks to build more accurate representations. Prof. Li's collaboration with Berkeley on real-time rendering using neural representations is pushing the boundaries of digital scene creation. The Metaverse Center's exploration of generative AI for dynamic scene digitization is set to revolutionize user interactions, making navigating Google Street View as seamless as a video game and enhancing our virtual exploration of the real world in real time.
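In the spirit of such neural representations (the general approach popularized by NeRF-style methods), a toy example looks like the following: a small network maps a 3D point to color and density, and a pixel is rendered by integrating samples along a camera ray. The network and renderer below are illustrative assumptions, not the Metaverse Center's actual system.

```python
# Toy sketch of a neural scene representation with volume rendering,
# in the spirit of NeRF-style methods; all details are assumptions.
import torch
import torch.nn as nn

class SceneField(nn.Module):
    """MLP: 3D position -> (RGB color, volume density)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),   # 3 color channels + 1 density
        )

    def forward(self, xyz):
        out = self.mlp(xyz)
        rgb = torch.sigmoid(out[..., :3])
        sigma = torch.relu(out[..., 3])
        return rgb, sigma

def render_ray(field, origin, direction, n_samples=32, near=0.0, far=4.0):
    """Composite a pixel color along one ray with volume rendering."""
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction        # (n_samples, 3)
    rgb, sigma = field(pts)
    delta = (far - near) / n_samples             # uniform step size
    alpha = 1.0 - torch.exp(-sigma * delta)      # per-sample opacity
    # Transmittance: how much light survives to reach each sample
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(dim=0)   # final pixel color

field = SceneField()
color = render_ray(field, torch.zeros(3), torch.tensor([0.0, 0.0, 1.0]))
print(color)   # an RGB value in [0, 1]
```

Because the scene lives in the network's weights rather than in explicit scan geometry, such representations can be queried from any viewpoint, which is what makes the "Street View as a video game" vision plausible.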
The advent of this new virtual era echoes the nascent stages of smartphone proliferation, characterized by a narrowing digital divide through user-friendly design. This parallels our current exploration of AR/VR technologies, underscoring the importance of their ease of use and accessibility alongside their innovative features. As we edge towards 2024, our journey resembles passing through a technological looking glass, venturing into a domain where the lines between digital and physical realities increasingly converge. AI serves not only as a remedy for the existing limitations of AR/VR but also as a crucial driver for their broad acceptance, representing the most viable path to integrating these technologies seamlessly into our everyday lives.