OPML: Machine Learning with Optimistic Rollup System

Source: GitHub; compiled by MarsBit

TL;DR

We propose OPML (Optimistic Machine Learning), which uses optimistic methods for AI model inference and training/fine-tuning on blockchain systems.

Compared with ZKML, OPML can provide low-cost and high-efficiency ML services. The participation requirements for OPML are low: we are now able to run OPML with large language models such as 7B-LLaMA (model size ~26GB) on an ordinary PC without a GPU.

OPML employs a verification game (similar to Truebit and Optimistic Rollup systems) to guarantee the decentralization and verifiable consensus of ML services.

  • The requester first initiates an ML service task.
  • The server then completes the ML service task and submits the result on chain.
  • A verifier checks the result. Suppose a verifier claims the result is wrong: it starts a verification game with the server (a bisection protocol) and tries to disprove the result by pinpointing one specific incorrect step.
  • Finally, single-step arbitration takes place on the smart contract.
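The steps above can be sketched as a toy simulation (hypothetical code, not the actual OPML implementation): the server and a challenger each hold a trace of execution states, bisection finds the first step where they disagree, and a cheap "arbiter" re-executes only that single step.

```python
def step(state: int) -> int:
    """One deterministic VM step (toy state transition)."""
    return state * 3 + 1

def trace(start: int, n_steps: int, faulty_at: int = -1) -> list[int]:
    """Produce the state after each step; optionally inject a fault."""
    states = [start]
    for i in range(n_steps):
        nxt = step(states[-1])
        if i == faulty_at:          # a dishonest server corrupts this step
            nxt += 7
        states.append(nxt)
    return states

def bisect_dispute(honest: list[int], claimed: list[int]) -> int:
    """Binary-search for the first state index where the traces diverge."""
    lo, hi = 0, len(honest) - 1     # index lo agreed on, index hi disputed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if honest[mid] == claimed[mid]:
            lo = mid                # still agree at mid: dispute is later
        else:
            hi = mid                # already differ at mid: dispute is earlier
    return hi

def arbitrate(claimed: list[int], disputed: int) -> bool:
    """On-chain arbiter re-executes only the single disputed step."""
    return step(claimed[disputed - 1]) == claimed[disputed]

honest = trace(1, 16)
claimed = trace(1, 16, faulty_at=9)   # server cheats at step 9
idx = bisect_dispute(honest, claimed)
print(idx, arbitrate(claimed, idx))   # arbitration rejects the disputed step
```

The key property is that the judge's work is a single `step` call, regardless of how long the full computation is; only the bisection (logarithmically many state comparisons) and one step reach the chain.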

Single Phase Verification Game

The single-phase pinpointing protocol works similarly to referenced delegation of computation (RDoC), where two or more parties (at least one of them honest) are assumed to execute the same program. The parties can then cross-examine each other to precisely identify the disputed step, which is sent to a judge with less computational power (a smart contract on the blockchain) for arbitration.

In single-phase OPML:

  • We build a virtual machine (VM) for off-chain execution and on-chain arbitration, and guarantee the equivalence of the off-chain VM and the on-chain VM implemented in the smart contract.
  • To make AI model inference efficient inside the VM, we implemented a lightweight DNN library designed specifically for this purpose, instead of relying on popular ML frameworks such as TensorFlow or PyTorch. A script is also provided that converts TensorFlow and PyTorch models to this lightweight library.
  • Cross-compilation is used to compile the AI model inference code into VM instructions.
  • The VM image is managed with a Merkle tree; only the Merkle root, which represents the VM state, is uploaded to the smart contract on chain.
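The Merkle commitment in the last bullet can be sketched as follows (an assumed scheme; OPML's actual tree layout may differ). Only the 32-byte root goes on chain, and any memory page can later be proven against it with a logarithmic-size path.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-256 used as the tree hash (an assumption for this sketch)."""
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Merkle root of a power-of-two list of leaves."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Toy VM image: 8 fixed-size memory pages.
pages = [bytes([i]) * 32 for i in range(8)]
root = merkle_root(pages)
print(root.hex())   # this single digest is all that is posted on chain
```

Changing any page changes the root, so the contract can detect tampering while storing only one hash.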


  • The bisection protocol helps locate the disputed step, which is sent to the arbitration contract on the blockchain.


Performance: We tested a basic AI model (a DNN for MNIST classification) on a PC. DNN inference completes within 2 seconds in the VM, and the entire challenge process completes within 2 minutes in a local Ethereum test environment.

Multi-Phase Verification Game

Limitations of Single-Phase Pinpointing Protocols

The single-phase verification game has a serious drawback: all computation must be performed inside the virtual machine (VM), which prevents us from exploiting the full potential of GPU/TPU acceleration or parallel processing. This limitation severely hinders the efficiency of large-model inference, and it is shared by current RDoC protocols.

Transition to Multi-Phase Protocol

To address the limitations of the single-phase protocol and ensure that OPML can achieve performance comparable to a native environment, we propose an extension to a multi-phase protocol. With this approach, computation needs to be performed in the VM only at the final phase, as in the single-phase protocol. In the other phases, we have the flexibility to perform the state-transition computations in a native environment, leveraging CPUs, GPUs, TPUs, and even parallel processing. By reducing the dependency on the VM, we significantly cut the overhead and thus substantially improve the execution performance of OPML, approaching that of the native environment.

The figure below illustrates a verification game consisting of two phases (k = 2). In Phase 1, the process resembles the single-phase verification game: each state transition corresponds to a single VM micro-instruction that changes the VM state. In Phase 2, a state transition corresponds to a "large instruction" comprising multiple micro-instructions, which changes the computational context.

The submitter and the verifier first run the Phase-2 bisection protocol to locate the disputed step among the "large instructions". This step is then passed down to the next phase, Phase 1, which works like the single-phase verification game: the Phase-1 bisection locates the disputed VM micro-instruction, which is sent to the arbitration contract on the blockchain.

To ensure the integrity and security of the transition between phases, we rely on Merkle trees: moving to the next phase amounts to extracting a Merkle subtree from the higher-level tree, guaranteeing a seamless continuation of the verification process.


Multi-Phase OPML

Here we present the two-phase OPML approach applied to the LLaMA model:

  • The computation of machine learning (ML) models, in particular deep neural networks (DNNs), can be expressed as a computation graph, denoted G, whose nodes store intermediate computation results.
  • DNN model inference is essentially a computation process over this graph. The whole graph can be viewed as the inference state (the computational context of Phase 2): as each node is computed, its result is stored in the node, advancing the computation graph to its next state.


  • We can therefore first run the verification game on the computation graph (Phase 2). In this phase, the computation of each graph node can be performed natively using multi-threaded CPUs or GPUs. The bisection protocol locates the disputed node, whose computation is passed down to the Phase-1 bisection.
  • In the Phase-1 bisection, we convert the computation of that single node into VM instructions, as in the single-phase protocol.

Note that we anticipate introducing multi-phase OPML (with more than two phases) when the computation of a single node in the computation graph remains computationally heavy. This extension will further improve the overall efficiency and effectiveness of the verification process.
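The two-phase dispute search described above can be sketched as a toy simulation (hypothetical code; node counts, the micro-instruction model, and the fault location are all assumptions for illustration). Phase 2 bisects over graph-node outputs, and Phase 1 bisects over the micro-instructions inside the single disputed node.

```python
def micro_step(state: int) -> int:
    """One VM micro-instruction (toy transition)."""
    return (state * 5 + 3) % (1 << 31)

def node_compute(state: int, m: int, corrupt: bool = False) -> list[int]:
    """Run one graph node as m micro-steps; optionally corrupt the last one."""
    states = [state]
    for i in range(m):
        nxt = micro_step(states[-1])
        if corrupt and i == m - 1:
            nxt ^= 1
        states.append(nxt)
    return states

def first_divergence(a: list[int], b: list[int]) -> int:
    """Bisect for the first index where two equal-length traces differ."""
    lo, hi = 0, len(a) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if a[mid] == b[mid]:
            lo = mid
        else:
            hi = mid
    return hi

n, m = 8, 16                       # n graph nodes, m micro-steps per node
bad_node = 5                       # the server cheats inside node 5

honest_nodes, claimed_nodes = [0], [0]
for i in range(n):
    honest_nodes.append(node_compute(honest_nodes[-1], m)[-1])
    claimed_nodes.append(
        node_compute(claimed_nodes[-1], m, corrupt=(i == bad_node))[-1])

# Phase 2: bisect over node-level states (fast, runs natively on CPU/GPU).
d_node = first_divergence(honest_nodes, claimed_nodes) - 1  # disputed node

# Phase 1: bisect over micro-steps inside that one node (runs in the VM).
honest_micro = node_compute(honest_nodes[d_node], m)
claimed_micro = node_compute(claimed_nodes[d_node], m, corrupt=True)
d_micro = first_divergence(honest_micro, claimed_micro)
print(d_node, d_micro)   # only this single micro-step reaches the contract
```

Only the one disputed node is ever re-executed inside the VM; everything else runs at native speed, which is the source of the speedup analyzed in the next section.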

Performance Improvements

Here, we provide a brief discussion and analysis of our proposed multi-stage verification framework.

Assume the DNN computation graph has n nodes, and that each node takes m VM micro-instructions to compute in the VM. Let α be the speedup ratio achieved for each node by GPU or parallel computing; α can be substantial, often tens or even hundreds of times faster than VM execution.

Based on these considerations, we draw the following conclusions:

  1. Two-phase OPML outperforms single-phase OPML, achieving an α-fold computation speedup. Multi-phase verification lets us exploit the accelerated computing power of GPUs or parallel processing, significantly improving overall performance.

  2. Comparing Merkle tree sizes, in two-phase OPML the size is O(m + n), while in single-phase OPML it is O(mn), which is significantly larger. This reduction in Merkle tree size further highlights the efficiency and scalability of the multi-phase design.
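A back-of-envelope illustration of the second claim (the node and instruction counts here are invented for illustration, not measured OPML figures): single-phase OPML commits to every micro-step of every node, O(mn) leaves, while two-phase OPML commits to n node states plus the m micro-steps of the one disputed node, O(m + n) leaves.

```python
# Assumed toy parameters, not real OPML measurements.
n = 10_000        # nodes in the DNN computation graph
m = 1_000_000     # VM micro-instructions per node

single_phase_leaves = m * n   # one leaf per micro-step of every node
two_phase_leaves = m + n      # node states + micro-steps of one node

print(f"single-phase: {single_phase_leaves:.1e} leaves")
print(f"two-phase:    {two_phase_leaves:.1e} leaves")
print(f"reduction:    {single_phase_leaves / two_phase_leaves:.0f}x")
```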

In summary, the multi-phase verification framework provides significant performance improvements, ensuring more efficient and faster computation, especially when exploiting the acceleration capabilities of GPUs or parallel processing. The reduced Merkle tree size further increases the effectiveness and scalability of the system, making multi-phase OPML the preferred choice for many applications.

Consistency and Determinism

In OPML, ensuring the consistency of ML results is critical.

During native execution of DNN computations, especially across different hardware platforms, results may differ because of the nature of floating-point arithmetic. For example, parallel computations over floating-point numbers, such as (a + b) + c versus a + (b + c), often produce different results due to rounding error. Factors such as programming language, compiler version, and operating system can also affect floating-point results, causing further inconsistencies in ML outputs.
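The non-associativity mentioned above is easy to demonstrate: with IEEE 754 doubles, the two groupings of the same sum can round differently, so two machines that merely sum in a different order can disagree on a result.

```python
# Same three numbers, two association orders, two different IEEE 754 results.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left, right, left == right)   # the sums differ in the last bit
```

A single such one-bit discrepancy anywhere in a billion-operation inference is enough to make an honest server's Merkle root differ from an honest verifier's, which is why OPML must pin down the arithmetic itself.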

To address these challenges and guarantee the consistency of OPML, we adopted two key approaches:

  1. Use fixed-point arithmetic, also known as quantization. This lets us represent values and perform computations with fixed precision rather than floating point, mitigating the effects of floating-point rounding errors and yielding more reliable and consistent results.

  2. Use software-based floating-point libraries designed to behave identically across platforms. These libraries ensure cross-platform consistency and determinism of ML results, regardless of the underlying hardware or software configuration.
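A minimal sketch of the fixed-point approach from item 1 (a toy scheme, not OPML's actual quantizer): values are stored as integers scaled by 2^FRAC_BITS, so every add and multiply is exact integer arithmetic and therefore bit-identical on any platform, in any summation order.

```python
FRAC_BITS = 16
SCALE = 1 << FRAC_BITS

def to_fixed(x: float) -> int:
    """Quantize a real number to a scaled integer."""
    return round(x * SCALE)

def from_fixed(q: int) -> float:
    """Convert a scaled integer back to a real number."""
    return q / SCALE

def fx_mul(a: int, b: int) -> int:
    """Fixed-point multiply: the product carries 2*FRAC_BITS, shift one off."""
    return (a * b) >> FRAC_BITS

# A tiny dot product in fixed point: deterministic regardless of summation
# order, because integer addition is associative.
w = [to_fixed(v) for v in (0.5, -1.25, 2.0)]
x = [to_fixed(v) for v in (0.1, 0.2, 0.3)]
acc = 0
for wi, xi in zip(w, x):
    acc += fx_mul(wi, xi)
print(from_fixed(acc))   # ≈ 0.5*0.1 - 1.25*0.2 + 2.0*0.3 = 0.4
```

The price is a small, fixed quantization error; the gain is that every honest party computes the exact same bits, so Merkle roots of honest states always match.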

By combining fixed-point arithmetic and software-based floating-point libraries, we establish a solid foundation for consistent and reliable ML results within the OPML framework. This combination of techniques overcomes the inherent challenges posed by floating-point variability and platform differences, ultimately enhancing the integrity and reliability of OPML computations.

OPML vs ZKML

(Table: OPML vs. ZKML comparison)

*: In the current OPML framework, our main focus is on ML model inference, enabling efficient and secure model computation. It should be emphasized, however, that the framework also supports the training process, making it a general solution for a wide range of machine learning tasks.

Note that OPML is still under development. If you are interested in being part of this exciting program and contributing to the OPML project, please feel free to contact us.
