Random Tech News

First-class Technology, a company focused on developing a new generation of deep learning frameworks, has completed a 50-million-yuan Series A financing round led exclusively by Hillhouse Ventures


Countless companies have used machine learning to solve specific problems, and it has landed in industry after industry: face recognition, speech recognition and translation, advertising recommendation, finance and insurance, autonomous driving, and production lines where it improves manufacturing efficiency. Although AI founders such as Marvin Minsky urged everyone in 2004 not to pay so much attention to AI that solves specific problems such as playing chess, driving, or translation, and to focus instead on AI that thinks and acts like a human, this has not stopped industries from working hard on those specific problems.

The deep learning framework is the most important piece of core infrastructure software since Hadoop and Spark, and First-class Technology is a company dedicated to building a new-generation one. Founder Yuan Jinhui told 36氪 reporters that he decided from the start that the business would be open source, but that a latecomer needs a blockbuster product to succeed with open source. Conventional wisdom holds that a startup developing a general-purpose deep learning framework cannot compete with the major vendors, but competition on this track is ultimately a competition in product quality: such products win, in the final analysis, on innovation, and in innovation a startup is not necessarily weaker than the giants. First-class Technology's goal is to build an advanced deep learning framework that users around the world love, lowering the cost and difficulty for enterprises adopting artificial intelligence. Since an AI framework serves as infrastructure software on top of which each enterprise builds its own applications and evolution, a framework that is not open source is difficult to adopt widely.

OneBrain platform operation interface

After four years of development, First-class Technology's R&D team has grown from 3 to 40 people and has completed OneFlow, one of the few deep learning frameworks in the world capable of supporting ultra-large-scale deep learning model training, along with OneBrain, an open AI platform for users. Yuan Jinhui said that OneFlow and OneBrain are general-purpose and can solve most of the problems encountered across industries, rather than only specific tasks. First-class Technology's deep learning products are built on fully independent intellectual property, with more than ten invention patents granted, and rest on four core technologies: automatic data-model hybrid parallelism, static scheduling, decentralization, and full-link asynchronous streaming execution. Together these address the big-data, big-model, and big-compute challenges of training large-scale deep learning models on heterogeneous distributed clusters. In the "First Round of Open-Source Deep Learning Software Framework Test Report" released by the China Academy of Information and Communications Technology in May 2020, the OneFlow framework significantly led foreign products on performance metrics under identical algorithm and hardware conditions.
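The "automatic data-model hybrid parallelism" above is OneFlow-specific, but the two ingredients it combines are generic. The sketch below (plain Python, not OneFlow's actual API; all function names are invented for illustration) contrasts data parallelism, where each worker sees a slice of the batch and a full copy of the weights, with model parallelism, where each worker holds only a shard of the weights:

```python
# Illustrative toy only: a "worker" here is just a function call,
# not a real device. Hybrid parallelism applies both splits at once.

def matvec(rows, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(r * v for r, v in zip(row, x)) for row in rows]

def data_parallel(batch, weights):
    """Each worker handles one sample with a full weight copy;
    results are simply concatenated along the batch dimension."""
    return [matvec(weights, sample) for sample in batch]

def model_parallel(sample, weight_shards):
    """Each worker holds a slice of the weight rows; the partial
    outputs are concatenated to rebuild the full output."""
    out = []
    for shard in weight_shards:  # one shard per worker
        out.extend(matvec(shard, sample))
    return out

W = [[1, 0], [0, 1], [1, 1]]  # 3x2 weight matrix
x = [2, 3]

full = matvec(W, x)                          # single-device result
sharded = model_parallel(x, [W[:2], W[2:]])  # weights split in two
assert full == sharded                       # same answer either way
```

The point of a hybrid scheme is that neither split alone suffices for a model too large for one device trained on a batch too large for one device; a framework has to choose and coordinate both automatically.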

OneFlow framework

Training and applying ultra-large-scale neural network models is a new direction driven by Google and OpenAI since around 2019. Earlier neural networks were trained on large amounts of data, but the models themselves were small, with few layers and edges and only tens of millions to billions of parameters. From 2019, Google began to develop large models, with parameter counts in the billions and hundreds of billions. The benefit of a large model, simply put, is that training a large model first yields better results on the same amount of data than a small model; in many scenarios, small-model training cannot reach the accuracy of a large model at all. After training, the large model's redundant parameters can be removed to shrink it before deployment. Large models outperform common convolutional neural network methods in many application scenarios, such as non-vision applications like NLP, high-parameter applications like face recognition, and advertising recommendation. Current customers of First-class Technology include Zhijiang Laboratory, Beijing Zhiyuan Artificial Intelligence Research Institute, Zhongguancun Intelligent Application Research Institute, the Institute of Automation of the Chinese Academy of Sciences, Sogou, and Shenxue. It is reported that in 2021 First-class Technology will begin exploring business models similar to the proven public cloud services of Snowflake and Databricks.
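The "train large, then remove redundant parameters" recipe described above is commonly realized as magnitude pruning: after training, the smallest-magnitude weights are zeroed out before deployment. A minimal sketch of that generic technique (not a specific OneFlow feature; `magnitude_prune` is a name invented here):

```python
def magnitude_prune(weights, keep_ratio):
    """Zero out the smallest-magnitude weights, keeping roughly
    keep_ratio of them (ties at the threshold are all kept)."""
    n_keep = max(1, int(len(weights) * keep_ratio))
    # The magnitude threshold below which weights are dropped.
    threshold = sorted((abs(w) for w in weights), reverse=True)[n_keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]

trained = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
pruned = magnitude_prune(trained, keep_ratio=0.5)
assert pruned == [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]  # half the weights survive
```

Real pipelines usually fine-tune the network again after pruning to recover accuracy, and may store the result in a sparse format so the zeroed weights cost nothing at inference time.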

Four core technologies of first-class technology

Yuan Jinhui told 36氪 reporters that users' main requirements for an open-source deep learning framework are as follows:

  1. Operational efficiency: under the same algorithm and hardware conditions, the framework must run fast, which lowers users' costs as far as possible and makes practical deployment viable. This dimension is the most technically challenging, and it is where OneFlow's greatest advantage lies;

  2. Ease of use: ease of use is a point overlooked by many AI academics and companies. At present, PyTorch is the best in this respect, which is one of the reasons it is so widely used. First-class Technology's goal is to make the new version of OneFlow as easy to use as PyTorch;

  3. Completeness of framework tooling: the OneBrain open platform also helps users with data annotation and cleaning, model libraries, visualization, training and deployment, and cluster management, so users need not worry about the underlying plumbing; they can go to war without first having to build their own tanks.

Goal: Make AI a standardized product

Artificial intelligence has developed rapidly around the world in recent years, from perception to learning, but what characteristics count as AI? The authoritative account in Stuart J. Russell and Peter Norvig's "Artificial Intelligence: A Modern Approach" divides the field into four directions: thinking like a human, acting like a human, thinking rationally, and acting rationally. Along these four directions, different scholars and entrepreneurs are exploring in different ways, and more and more disciplines are involved, not just pure mathematics and computer engineering:

  1. The Turing test, for example, determines whether a machine acts like a human. This direction alone involves natural language processing (NLP), so the machine can communicate in English; knowledge storage, so the machine can retain what it perceives; logical reasoning, so the machine can use stored information to answer questions; machine learning, so the machine can adapt to unknown environments and discover new patterns in them; and machine vision, so the machine can perceive its external environment;

  2. Can machines think like humans and help humans solve problems? In 1961, in the field of cognitive science, Newell and Simon presented the General Problem Solver (GPS), whose purpose was not logical reasoning per se but simulating how humans solve problems;

  3. Can machines think in a purely rational way? Beginning in the 19th century, logicians used rigorous statements and grammars to reason toward the most rational answer, and by 1965 such logical reasoning could in principle be carried out by computer programs;

  4. Purely rational action is more complicated, encompassing both purely rational thinking and action. This is what we commonly call a rational agent: one that takes the best possible course of action in a given environment. It is also the direction in which many companies seek real-world deployment.

Based on public information, the 36氪 reporter has roughly classified the stages of machine learning across industries as follows. If there are omissions or errors, please contact the 36氪 reporter:

  1. AI's first commercial wave was the expert systems of 1982, which encoded algorithms as hand-written rules. In the 1980s, attention returned to neural networks, but for lack of big data and GPU computing power at the time, they did not take off. Not until the 1990s did statistical machine learning (statistics ML) rise and begin to see large-scale commercial use on real-world problems, no longer confined to toy model worlds. In simple terms, statistical machine learning uses statistical optimization algorithms to derive regularities, that is, it uses data to adjust the algorithm and predict the future. Decision trees, Bayesian networks, support vector machines, and AdaBoost are all statistical machine learning. These more traditional methods are still widely used in specific fields such as insurance and finance; well-known companies in these areas include Fourth Paradigm and Jiuzhang Yunji.

  2. Deep learning rose in 2012. The advantage of neural networks over traditional machine learning is that they do not require humans to design features. The popular deep learning frameworks of the time included Caffe and Theano, with declarative programming styles, and Chainer, with an imperative style. One of the most popular and still widely used models is the convolutional neural network (CNN). Because the CNN is inspired by the structure of the biological neural network of the brain's visual pathway, it is especially suited to computer vision and image classification (face recognition aside), but less suited to NLP and advertising recommendation systems. Companies that grew rapidly on the back of successful CNN applications in computer vision include Megvii, SenseTime, Yitu, and CloudWalk.

  3. After 2012, researchers kept trying new neural network structures, publishing papers, and finding applications, driving innovation in small-model network architectures. The recurrent neural network (RNN), for example, learns from data with contextual dependencies and suits language or text translation. Because all deep learning belongs to statistical machine learning, its defining trait is that, with the same algorithm, more data means higher accuracy; accuracy can be piled up with data volume, which lets big data release enormous potential, as in training convolutional neural networks on 3 billion images today, or training models such as BERT. Open-source deep learning frameworks at home and abroad suited to big data and small models include Google's open-source TensorFlow, Facebook's open-source PyTorch, Microsoft Research's CNTK, and Amazon's MXNet; domestic entries include Megvii's open-source MegEngine and Baidu's open-source PaddlePaddle (飞桨).

  4. After 2019, large-model neural networks demonstrated advantages that small models cannot match, and supporting ultra-large-scale model training became an important R&D direction for deep learning frameworks. Google developed Mesh-TensorFlow and GPipe on top of TensorFlow; Microsoft's DeepSpeed retrofits PyTorch but supports only NLP; Nvidia's Megatron-LM targets only NLP, and Nvidia's HugeCTR only advertising recommendation. The limited generality of these solutions has hampered their adoption. Google recently developed GShard on XLA to try to solve the generality problem, but it has not been open-sourced. Frameworks that have explored this direction include Huawei's open-source MindSpore and First-class Technology's open-source OneFlow at home, and Google's GShard abroad.

  5. Since 2020, small-sample (few-shot) training has drawn growing industry attention, focusing on scenarios with insufficient data, such as industrial data. If training on small samples can reach the same accuracy, it upends deep learning's appetite for data. For example, generative adversarial networks (GANs), as used alongside reinforcement learning (RL), can synthesize data, and image data can be multiplied by rotation, cropping, or brightness changes. Quite a few fields already use GAN methods to generate data closely resembling the original, but because the approach is relatively new, there is no mature solution to its common problems. Domestic companies that have explored small-sample training include Dark Matter Intelligence and Reliance Intelligence.
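The "use data to adjust the algorithm" idea behind stage 1's statistical machine learning can be made concrete with the simplest possible decision tree: a one-split stump fitted by minimizing training errors. A toy sketch (`fit_stump` is invented for illustration, not a library API):

```python
def fit_stump(xs, ys):
    """Fit a one-split decision stump: find the threshold on x that
    best separates the binary labels ys (0/1). A toy stand-in for
    the decision trees mentioned in stage 1: the 'law' (the split
    point) is derived from the data rather than hand-coded."""
    best = None
    for t in sorted(set(xs)):
        # Predict 1 for x >= t, 0 otherwise; count training errors.
        errors = sum((x >= t) != bool(y) for x, y in zip(xs, ys))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best[0]

# Learn the split from labeled examples, then predict.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]
t = fit_stump(xs, ys)
assert all((x >= t) == bool(y) for x, y in zip(xs, ys))  # perfect split
```

AdaBoost, also named in stage 1, works by combining many such weak stumps, reweighting the data after each one toward the examples the previous stumps got wrong.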
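Stage 2's claim that convolution suits images comes from the operation itself: a small kernel of shared weights slides over the image and responds to local patterns wherever they occur. A minimal pure-Python sketch of valid-mode convolution (strictly cross-correlation, as most deep learning frameworks implement it):

```python
def conv2d(image, kernel):
    """Valid-mode 2D convolution: slide the kernel over the image
    and take dot products. Weight sharing and locality are what
    make this operation a good fit for vision."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

# A vertical-edge detector fires only where the image changes.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
edges = conv2d(img, [[-1, 1],
                     [-1, 1]])
assert edges == [[0, 2, 0],
                 [0, 2, 0]]  # response only at the vertical edge
```

In a CNN the kernel weights are not hand-designed like this edge detector but learned from data, which is exactly the "no hand-crafted features" advantage stage 2 describes.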
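The rotation, cropping, and brightness tricks mentioned in stage 5 can be sketched in a few lines: each transform turns one labeled image into an extra training sample. Toy pure-Python versions, representing an image as a list of pixel rows:

```python
def flip_horizontal(image):
    """Mirror an image left-to-right."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def adjust_brightness(image, delta):
    """Shift pixel values, clamped to the 0-255 range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in image]

# One labeled image becomes four training samples, label unchanged.
img = [[10, 20],
       [30, 40]]
augmented = [img,
             flip_horizontal(img),
             rotate90(img),
             adjust_brightness(img, 50)]
assert len(augmented) == 4
```

GAN-based augmentation, also described in stage 5, goes a step further: instead of transforming existing samples, it trains a generator network to synthesize entirely new samples resembling the originals.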


