Foreword: games are at the forefront of technology application and innovation
In 1975, the U.S. Department of Defense provided MIT with several PDP-10 midrange computers, priced at about US$400,000 apiece and originally intended for simulated-weapons development; a video game called Adventure that ran on them proved far more popular with students. In 2021, Nvidia’s top-end graphics cards are still hard to find: hardware originally built for gamers and designers is now widely used in artificial intelligence and blockchain computing, thanks to advantages such as extremely high floating-point throughput. On May 26, the Unreal Engine 5 early-access build was released; features such as Nanite virtualized geometry, Lumen real-time global illumination, and MetaSounds bring generational leaps in realism and detail to next-generation games, real-time visualization, and immersive interactive experiences.
Look past games as content and social spaces and you will see their more “hard-core” side: games are not only a medium for creation and expression and a space for social interaction, but also a frontier of technology application and innovation. Technologies and tools common in game development are empowering fields such as film and television, simulation, and industry; emerging technologies such as AI and cloud computing treat games as deep application scenarios; new hardware and new forms of interaction bring breakthroughs in both computing power and experience. What technological breakthroughs will games pursue next, and how will they help the “super digital scene” accelerate the fusion of the virtual and the real and create more value? This series of articles attempts an interpretation from the perspective of key technologies and development paths, covering the following aspects:
- Game engines
- Digital humans
- Cloud gaming
- Mixed-reality devices
- AI-assisted content generation
The super digital scene: rehydrating the dehydrated Internet
In the futures imagined by science fiction, cyberspace grows ever more scene-like, becoming a “super digital scene” that integrates life, commerce, and entertainment. The Matrix, for example, depicts a simulated world indistinguishable from reality, and behind the VR headset of Ready Player One lies a fantastical world built from surreal scenes.
At the moment, though, when we click a video link we still anticipate the content itself rather than imagine walking into a virtual cinema; a bullet-comment overlay heightens the sense of presence, but still falls short of reproducing the movie-going experience of real life. In the shift from tangible real-world scenes to Internet scenes, hypertext changed the traditional relationship between time and space: we jump through “portals” between links, and the scene has become an abstraction that no longer implies specific facilities, environments, or experiences.
Cafes, galleries, art museums, and gyms are not purely physical facilities; they carry public activity and cultural meaning, thereby changing people’s attitudes and behaviors, shaping social life, and redefining urban scenes. This is one of the central arguments of the sociological work Scenescapes: How Qualities of Place Shape Social Life.
The super digital scene uses a vast, real-time, interactive virtual world to unfold the folded Internet into 3D space, conveying credible, immersive information through sight, sound, and more. It not only reproduces daily life but also realizes fantasy worlds and offers more choices: you can teleport instantly between any two places, a concert will not start until you “take your seat,” and changing your appearance (even your gender) is trivial. Although full-featured brain-computer interfaces and fully immersive sensory technology are still some way off, games such as Red Dead Redemption 2 and Cyberpunk 2077 already show an embryonic form of that future.
The game engine is the “dream factory” of the super digital scene
Constructing a “super digital scene” is, in essence, the abstraction, refinement, and reproduction of the real world, which requires several capabilities: measuring reality and redrawing it in the digital world; building digital avatars of “humans” with realistic appearance and movement; coordinating resources such as sound, image, and motion; and defining the operating rules of the new world. These capabilities must also be offered to creators of every level in an open, tool-based, easy-to-use form. Today, the thing closest to providing them is the game engine and its surrounding development tools.
A game engine is a suite of tools that raises the efficiency of game development and manages a game’s visual presentation and interaction logic. Well-known engines include Epic’s Unreal series, Unity, and CryEngine, alongside studios’ in-house engines such as EA’s Frostbite, Ubisoft’s Anvil, CD Projekt Red’s REDengine, NetEase’s NeoX and Messiah, and Tencent’s Quicksilver.
From a workflow perspective, the game engine is middleware. Art resources are built beforehand in other software: models in 3ds Max, Maya, or ZBrush; textures painted in Photoshop; scenes built in Houdini; motion data obtained through motion capture; all then imported into the engine for integration. Music and sound-effect assets are likewise produced with tools outside the engine. As game engines find more application fields, their connections with industrial software, simulation applications, and film and television production tools grow ever closer.
“Creating a world” is not easy. The game engine simulates the world’s inner and outer forms as faithfully as it can, sets basic rules of physics, and defines modes of interaction. It resembles a giant studio with complete, state-of-the-art equipment: once the actors, sets, lights, cameras, and dolly tracks are in place, coordinating image, motion, and interaction is the joint work of the creator and the engine. The amount of information in the real world approaches the infinite, so abstraction and simplification are required. Under the guiding idea of real-time computation that puts efficiency first while pursuing quality, game engines and their associated tools solve three core problems: rendering, physics, and motion.
Rendering: drawing ultra-realistic computer images
As you read this line of text, your computer, a set of protocols, and software are jointly performing a trick called “rendering,” turning code and data into a readable image on the display. In film and games the computer likewise draws raw data into frame after frame of imagery, only at a larger scale, with more tasks, and with far more complex and spectacular results. The core job is to convert a 3D model into a flat image as seen from the camera, restoring the color of every pixel as accurately as possible.
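The core conversion described above, mapping a 3D point to a 2D pixel, can be sketched with a simple pinhole-camera model. The function name, focal length, and resolution below are illustrative choices for the example, not any engine’s API.

```python
# Minimal sketch: project a camera-space 3D point onto a 2D image plane
# with a pinhole-camera model (illustrative names and defaults).

def project(point, focal_length=1.0, width=640, height=480):
    """Map a camera-space point (x, y, z), z > 0, to pixel coordinates."""
    x, y, z = point
    # Perspective divide: farther points land closer to the image center.
    ndc_x = focal_length * x / z
    ndc_y = focal_length * y / z
    # Map normalized device coordinates (-1..1) to pixel coordinates,
    # flipping y because image rows grow downward.
    px = (ndc_x + 1.0) * 0.5 * width
    py = (1.0 - (ndc_y + 1.0) * 0.5) * height
    return px, py

# A point straight ahead of the camera lands at the image center.
print(project((0.0, 0.0, 5.0)))  # (320.0, 240.0)
```

A full renderer repeats this for every vertex, then decides each covered pixel’s color from lighting and material, which is where the real cost lies.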
A cyberpunk world is hazy with mist, neon reflecting “colorful black” in the characters’ eyes; behind such imagery lie decades of tireless work in computer graphics and engineering. When it was founded in 1974, ACM SIGGRAPH (the Association for Computing Machinery’s Special Interest Group on Computer Graphics) was a small group of subject experts; it has since grown into an international community of researchers, artists, developers, filmmakers, and industry professionals, and its annual SIGGRAPH and SIGGRAPH Asia conferences release the field’s most forward-looking innovations in graphics and interactive techniques.
Rendering can mean an enormous amount of computation, and balancing quality against efficiency is hard. Today rendering divides into offline rendering, common in film and television, and the real-time rendering used in games.
To achieve subtle, realistic effects, film and television can afford to wait quietly for frames to be generated, on large computing clusters vividly nicknamed “render farms.” In Alita: Battle Angel, produced by Weta Digital, each of the protagonist’s more than 130,000 hairs was an “object” rendered individually; a single frame took 100 hours to render, 30,000 computers worked day and night, and the heat they gave off reportedly raised the temperature in Wellington.
A game must react to player input in real time: in shooting games, a master’s victory often comes down to a few frames, and the player’s hardware is already stretched thin. Efficiency therefore comes first, and visual subtraction, optimization, and even outright trickery are necessary. For example, occluded objects are culled and never computed; or a refined, high-polygon model is rendered once and its light-and-shadow detail is baked into texture maps (diffuse, specular, and normal maps) that are then applied to a low-polygon stand-in. Computer graphics, algorithms, and engineering join the game engine as tools in this “extreme challenge.”
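The “subtraction” tricks above can be sketched as a level-of-detail (LOD) decision: the farther an object is from the camera, the cheaper the model used for it, until it is culled entirely. The distance thresholds and LOD names here are made up for illustration; real engines pick them per asset.

```python
# Toy LOD selection: pick a cheaper model as the camera distance grows,
# and render nothing at all beyond the far limit. Thresholds are illustrative.

def select_lod(distance, lod_ranges=((10.0, "high"), (50.0, "medium"), (200.0, "low"))):
    """Return which level-of-detail model to render for a given camera distance."""
    for max_dist, lod in lod_ranges:
        if distance <= max_dist:
            return lod
    return None  # beyond the last range: cull the object entirely

print(select_lod(5.0))    # high-polygon model up close
print(select_lod(120.0))  # low-polygon stand-in at a distance
print(select_lod(500.0))  # None: too far away to draw
```

Combined with occlusion culling, this keeps the per-frame triangle count within what the player’s hardware can handle.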
Improvements in real-time rendering depend on advances in both algorithms and computing power. In a 3D scene, how do you accurately determine the current color of each pixel of a polygon? You must consider its shape, color, lighting, and material. With global illumination (GI) enabled, dark areas that receive no direct light are brightened by the indirect light of diffuse reflection, and detail and realism improve greatly.
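The effect of indirect light can be seen in the simplest shading model: with Lambertian (diffuse) shading and direct light only, surfaces facing away from the light go completely black, while even a crude indirect term lifts them. The constants below are illustrative, and the single `indirect` scalar is a stand-in for what real GI computes by tracing light bounces.

```python
# Lambertian diffuse shading with a crude indirect/ambient term standing in
# for global illumination. Albedo and indirect strength are illustrative.

def shade(normal, light_dir, albedo=0.8, indirect=0.1):
    """Diffuse intensity for unit-vector surface normal and light direction."""
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    direct = albedo * max(n_dot_l, 0.0)   # Lambert's cosine law, clamped at 0
    return direct + albedo * indirect      # indirect bounce brightens shadowed areas

lit    = shade((0, 1, 0), (0, 1, 0))   # surface facing the light
shadow = shade((0, -1, 0), (0, 1, 0))  # facing away: indirect light only
print(lit, shadow)  # 0.88 0.08
```

Without the indirect term, `shadow` would be exactly zero, which is why scenes rendered with direct lighting alone look unnaturally harsh.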
The development of GPUs, specialized graphics hardware, has greatly advanced rendering, especially light and shadow. For example, the ray-tracing algorithm framework proposed in the late 1970s traces each ray cast from the viewpoint through its reflections and refractions off object surfaces, reproducing shadows, mirror reflections, and refraction with relative accuracy. The computation involved is so heavy, however, that hardware-level real-time ray tracing only became feasible in 2018, when Nvidia released RTX GPUs and Microsoft launched the DXR API. Even on the then-flagship consumer card, the RTX 2080 Ti, enabling ray tracing can quickly drop a game to 30 frames per second or lower.
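The primitive at the heart of ray tracing is the ray-object hit test; for a sphere it reduces to solving a quadratic. The sketch below covers only this primary hit, whereas a full tracer fires one or more rays per pixel and recurses for reflection and refraction, which is where the cost explodes.

```python
import math

# Ray-sphere intersection, the core hit test of a ray tracer. Solving
# ||o + t*d - c||^2 = r^2 for t gives a quadratic; the smallest positive
# root is the nearest visible hit. Assumes `direction` is a unit vector.

def ray_sphere(origin, direction, center, radius):
    """Return the nearest positive hit distance t along the ray, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * c        # a == 1 because direction is unit length
    if disc < 0:
        return None               # the ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0 else None

# Ray from the origin along +z toward a sphere at z=5 with radius 1:
print(ray_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # 4.0
```

Multiply this by millions of pixels, several rays per pixel, and dozens of bounces, and the need for dedicated ray-tracing hardware becomes obvious.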
Physics: anchoring the laws of the new world
Digital 3D models follow at least two construction logics: drawing a hollow outer surface from vertices and polygons, or building solids from the inside out with Lego-like voxels. The latter is closer to how the physical world is composed. Minecraft, an unprecedented success in 2009, behaves much like a highly encapsulated voxel physics engine; building, assembling, and destroying blocks enriches how players interact with the digital world. But the sheer number of voxels demands too much real-time computation and represents materials and fine detail poorly, so mainstream game production still uses the polygon approach. UE5’s Nanite technology lets hundreds of millions of triangles coexist on screen, achieving film-grade fidelity.
In reality, smashing a large stone yields several small stones, and two stones that collide bounce apart; in an engine without physics support, decomposing a stone model merely yields a few broken surfaces, and two stones pushed together will pass through each other. Expressing colliding objects, pooling rain, fluttering hair, and crumbling buildings requires the engine’s physics capabilities. Well-known physics engines include the open-source Bullet, Nvidia’s PhysX, and Microsoft’s Havok. A physics engine uses generalized, simplified rules to achieve convincingly realistic effects.
The objects a physics engine handles divide roughly into rigid bodies and soft bodies. Whether a sphere in 3D space behaves as a soap bubble or a shot put depends on the mass, drag, and gravity settings of its rigid-body component; for the shot put to “land on” the floor rather than pass through it, a collider must be set so that motion stops when the two touch. Chinese-style wuxia and fantasy titles prize “fluttering robes” and have accumulated deep experience with soft-body cloth and hair. One approach works like an umbrella: bind bones to the cloth and move the cloth by pulling the bones. Another uses a mass-spring model to simulate some of the mechanical properties of cloth as it stretches and folds.
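The mass-spring idea can be shown in one dimension: each cloth particle is pulled toward its neighbor by a Hooke spring, then integrated forward in time. The stiffness, mass, and timestep below are illustrative values, not any engine’s defaults, and real cloth uses a 2D grid of such springs plus damping.

```python
# One spring of a toy mass-spring cloth model, integrated with explicit
# Euler. Particle 2 is pinned; particle 1 moves. Constants are illustrative.

def spring_step(p1, v1, p2, rest_len, k=10.0, mass=1.0, dt=0.01):
    """Advance particle 1's (position, velocity) by one timestep (1-D)."""
    stretch = (p1 - p2) - rest_len   # positive when stretched past rest length
    force = -k * stretch             # Hooke's law: F = -k * x
    v1 += (force / mass) * dt        # acceleration -> velocity
    p1 += v1 * dt                    # velocity -> position
    return p1, v1

# A particle stretched past the rest length is pulled back toward the pin.
p, v = 2.0, 0.0
for _ in range(100):
    p, v = spring_step(p, v, 0.0, rest_len=1.0)
print(p)  # oscillates around the rest length of 1.0
```

Explicit Euler is the simplest integrator and slowly gains energy; production engines use damped or implicit schemes, but the structure of the update is the same.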
Specific physical parameters can be set in the engine, such as the coefficient of friction, and the laws of the virtual scene can themselves be changed: gravity can match reality, or be set so that the player’s input changes its direction, as in the game Gravity Rush. With physics-engine support, real scenarios can be simulated virtually, whether reproducing the high-school textbook setup of a ball pulled by a spring on a slope, demonstrating Archimedes’ principle, or using reinforcement learning to teach a simulated dog to walk.
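The rigid-body and collider behavior described above can be condensed into a few lines: gravity accelerates a ball downward, and a floor collider stops it from sinking through, reflecting its velocity with some energy lost. The restitution value and timestep are illustrative assumptions for the sketch.

```python
# A ball under gravity with a floor collider. The collider clamps position
# and reflects velocity, losing energy each bounce. Values are illustrative.

GRAVITY = -9.8      # m/s^2, matching real-world gravity as in the text
RESTITUTION = 0.5   # fraction of speed kept after each bounce

def step(y, vy, dt=0.01, floor=0.0):
    """One explicit-Euler step for height y and vertical velocity vy."""
    vy += GRAVITY * dt
    y += vy * dt
    if y < floor:            # collider: the ball may not pass below the floor
        y = floor
        vy = -vy * RESTITUTION
    return y, vy

y, vy = 5.0, 0.0
for _ in range(2000):        # simulate 20 seconds of bouncing
    y, vy = step(y, vy)
print(round(y, 3))           # the ball has settled on the floor
```

Changing `GRAVITY`, or making it a player-controlled vector, is exactly the kind of rule rewriting the paragraph above describes.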
Motion: driving more natural character behavior
Among the many branches of game art, motion adjustment is hard labor through and through: designers manually pose every joint at each keyframe against reference photos or video, and when the feel eludes them, they drag in a colleague or act the motion out themselves to sense the alignment of bones and the pull of muscles. For a character to seem alive, both the realism and the richness of its motion are indispensable.
Motion capture can efficiently collect large amounts of motion data and markedly improve realism, and it is now a mature part of film effects and game production. Around 2015 the difficulty of using motion capture fell sharply: the time needed for stage calibration, skeleton modeling, and motion review shrank greatly, while accuracy rose enough to support finger capture. For their film-grade cutscenes, AAA games such as God of War (2018) simultaneously record the actors’ body movements, facial expressions, and camera movements, along with reference footage.
Collecting enough motion clips is only the first step; smoothly connecting one action to the next was long a pain point. The brute-force solution is to author a transition animation between every pair of independent actions: with 100 basic actions, that means up to 100*99/2 = 4,950 extra animations. The Motion Matching technique Ubisoft used in For Honor instead builds on a large corpus of continuous motion data: given the player’s input and the character’s state, it automatically searches for the clip that best matches the character’s imminent state and plays it, saving labor while keeping motion continuous and natural.
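Stripped to its core, motion matching is a nearest-neighbor search over motion features. The tiny database below uses a 2-D desired velocity as the only feature and toy clip names; real systems match high-dimensional features (trajectory, joint positions, velocities) over hundreds of thousands of frames.

```python
# A stripped-down sketch of motion matching: find the captured clip whose
# feature vector best matches the character's desired state. The database
# and its 2-D velocity features are toy values for illustration.

MOTION_DB = [
    ("idle",   (0.0, 0.0)),
    ("walk",   (1.2, 0.0)),
    ("run",    (4.0, 0.0)),
    ("strafe", (0.0, 1.0)),
]

def match_motion(desired_velocity):
    """Return the name of the clip nearest to the desired velocity."""
    def dist(feature):
        # Squared Euclidean distance between feature and desired state.
        return sum((a - b) ** 2 for a, b in zip(feature, desired_velocity))
    return min(MOTION_DB, key=lambda entry: dist(entry[1]))[0]

print(match_motion((3.5, 0.2)))  # run
print(match_motion((0.1, 0.9)))  # strafe
```

The search runs every few frames, so the transition problem dissolves: instead of authoring 4,950 hand-made transitions, the system always jumps to whatever captured frame fits best next.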
With AI, the field will move from playing predetermined animations to agents learning complex actions on their own. At SIGGRAPH 2020 Ubisoft presented its latest result, “Learned Motion Matching” (https://montreal.ubisoft.com/en/learned-motion-matching-2/). Using neural networks and deep learning, the system abstracts the logic connecting motion and environment, learns to combine complex actions such as climbing slopes and sitting in chairs, switches between actions more naturally, and integrates effectively with the virtual environment, while further reducing training time and computational cost and improving the approach’s generality.
Beyond games: tightly combining the virtual and the real
The engine’s real-time rendering is subtle and immersive; blended with real scenes, it can pass for reality, and its accumulated physics and motion capabilities can support simple simulation applications. Some commercial engines have therefore begun opening these capabilities to other industries with more tailored tools, and are gradually being applied wherever high-fidelity, real-time presentation matters: architecture, film and television, automobiles, live streaming, simulation training, industrial manufacturing, advertising, and more.
Film and television production used to run green-screen shooting and post-production compositing in sequence. Because shooting and production were separated, everyone on the team faced difficulties: actors had to imagine the scene they performed in, blending people into scenery demanded heavy post-production work, the director could not see the final look in real time, and prop and set masters could not adjust the scenery on the spot.
Virtual production is changing that process with a real-time pipeline, and the game engine is making its scenes more believable. Wrap-around real-time LED walls are erected on set, so blue sky, white clouds, and the surrounding scenery are genuinely reflected in the actors’ sunglasses, leather jackets, and other glossy materials; if the mountains, clouds, or mist in a shot do not work, they can be modified on the spot. Combining virtual and real shooting raises overall efficiency and gives artists more inspiration and creative room.
More than half of the first season of The Mandalorian was shot this way: four synchronized PCs running Unreal Engine drove the LED wall in real time, playing pre-built high-precision art assets, while three Unreal operators simultaneously manipulated the wall’s virtual scenery, lighting, and effects; crew inside the LED volume could also adjust the scene remotely from an iPad, working side by side with the director. Besides saving the cost of location shoots, the team could capture large numbers of complex visual-effects shots in camera, with accurate lighting and reflections, and iterate the scene together in real time on set. Productions such as Westworld and Game of Thrones have also adopted the virtual production model.
The game engine is a real-time 3D creation platform, and the automotive industry applies its technology widely: design and engineering validation, interior and exterior styling and HMI (human-machine interfaces), marketing and point-of-sale display, simulation training, and autonomous driving.
We often see slick showcases of car exteriors in advertisements, glass, metal, and paint glinting under the lights to trace a smooth, vivid body line. A game engine can render such detail in real time, letting designers adjust a car’s shape, color, materials, and details at any moment and add complex lighting for atmosphere, supporting video advertising along the way. Pixel Streaming technology can even present the model directly in a web page, where users on any device can rotate and zoom it at will.
Autonomous driving needs vast amounts of testing and data to improve its algorithms’ efficiency and reliability, which traditional road testing struggles to supply. According to research by the RAND Corporation, an autonomous driving algorithm would need to accumulate at least 17.7 billion kilometers of driving data to reach the level of a human driver. Traditional road testing is expensive, carries safety risks, and cannot cover the many unexpected situations reality produces. What car can run around the clock without a drop of fuel, and what proving ground can conjure extreme weather, vehicles swerving across lanes, congestion, or a pedestrian stepping out of a blind spot in a single second, all without endangering real road users?
Autonomous-driving simulation systems arose to meet this need, and the rendering and physics capabilities of game engines play an important role in them. Tencent’s TAD Sim, for example, adopted UE4 for better environment simulation and more accurate sensor simulation. The engine’s rendering produces highly realistic light and shadow for simulated scenes, along with weather such as wind, frost, rain, and snow; its physics makes it possible to define rules for the simulated world identical to the real one, so that changes in lighting and weather, and the behavior of every traffic element, affect the scene just as they would in reality. This matters greatly for training vehicle perception and control algorithms. The scene library currently holds more than 1,000 scene types, which can be generalized into scenes more than 10,000 times richer, supporting over 10 million kilometers of testing per day.
What surprises will game engines bring in the next ten years?
Creating a world can be very simple: a sheet of paper, a pen, and unconstrained imagination. It can also be very complicated, so laborious that even God needed a day of rest. Measured against building the “OASIS” of Ready Player One, today’s game engines still fall far short in capability and features. But with growing hardware computing power, the tireless work of graphics researchers and engineers, and the deft hands of artists and designers, engines are accelerating toward greater power, ease of use, and versatility. In the future, game engines will break through in the following directions:
More powerful all-round expressiveness to deepen immersion: real-time film-grade realism and rich detail on screen, more believable lighting and particle effects, physical simulation closer to the real world, and more accurate sound-field reproduction deeply integrated with the scene.
Easier to use and more versatile: keeping complex technology below the waterline so that ordinary creators can achieve high-end results with simple operations, with feature sets composed for different scenarios, needs, and users; better cross-platform support for new hardware such as VR, AR, and MR devices; exploration of cloud-native game-engine capabilities that exploit GPU-cluster rendering to push beyond single-card computing power; and tighter synergy with related software.
Data-driven deep learning: assisting procedural generation, raising the capability and efficiency of art production so that scene scale and complexity can meet the needs of open worlds, making character motion realistic and coherent, and completing scene testing efficiently.
Broader application: penetrating more industries that need real-time, high-fidelity visuals and immersive experiences, with deeper integration and interaction with robots, drones, and the Internet of Things.
Looking ahead, as virtual and real experiences merge and the physical and digital worlds integrate, game technologies and tools, with the game engine foremost among them, will generalize their capabilities, connecting more fields, functions, and services across film and television, industry, and simulation, and creating greater social value.
Source: Tencent Research Institute