Machine Learning for Result Estimation of Timing, Resource Usage, and Operation Delay in High Level Synthesis

In high-level synthesis (HLS), obtaining accurate result estimates at an early stage is difficult because of the complex optimizations that happen later, during physical synthesis. This creates a trade-off between efficiency, which favours evaluating a design at the HLS stage itself, and accuracy, which requires waiting for the post-synthesis results. Machine learning can improve the accuracy of early estimates by learning from real benchmarks.

One set of estimation targets is timing, resource usage, and operation delay. The main methodology is to train an ML model that takes HLS reports as input and outputs a more accurate implementation report, without running the time-consuming post-implementation flow.

The workflow can be broadly divided into two steps:

Data Processing:

Like every ML model, an HLS estimation model requires training and testing data. The HLS and implementation reports are usually collected by running each design through the complete C-to-bitstream flow, for various clock periods and for different target FPGA devices. Features extracted from the HLS reports then serve as inputs, and features extracted from the implementation reports serve as outputs. The resulting dataset is large and many of its features are correlated. Feature selection techniques are therefore applied to counter the effect of collinearity, reduce the dimension of the data, and retain only the most significant features; the unimportant ones are removed.
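The post does not say which feature selection technique is used, so the following is only a minimal sketch of one simple option: a correlation filter that drops any feature that is almost perfectly correlated with a feature already kept. The feature names, the 0.95 threshold, and the numbers are all hypothetical and are not taken from any real HLS report.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)

def drop_collinear(features, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Synthetic, illustrative values only (hypothetical feature names).
hls_features = {
    "est_lut": [100, 220, 310, 400, 530],
    "est_lut_x2": [200, 440, 620, 800, 1060],  # exact multiple of est_lut
    "est_dsp": [4, 1, 8, 2, 6],
    "target_clock_ns": [10, 5, 10, 5, 4],
}

selected = drop_collinear(hls_features)
print(selected)
```

A filter like this only handles pairwise linear redundancy. Model-based feature ranking is another common choice for deciding which of the remaining features are the most significant.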

[Figure: Selected features after dimensionality reduction]


Training the Estimation Models:

Once the refined dataset is available, regression models are trained to estimate post-implementation resource usage and clock period. Frequently used metrics for reporting the estimation error include relative absolute error (RAE) and relative root mean squared error (relative RMSE). For both metrics, lower is better.

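In its usual form, RAE is defined as

$$\mathrm{RAE} = \frac{\lVert y' - y \rVert_1}{\lVert y - \bar{y} \rVert_1}$$

where

y' = vector of values predicted by the model,

y = vector of actual ground truth values in the testing set,

\bar{y} = mean value of y.

Relative RMSE is defined as

$$\text{Relative RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \frac{y'_i - y_i}{y_i} \right)^2} \times 100\%$$

where

N = number of samples,

y'_i = predicted value of sample i,

y_i = actual value of sample i.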
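To make the training and evaluation step concrete, here is a minimal sketch that fits a one-feature least-squares line and scores it with RAE and relative RMSE in their usual forms. The choice of a plain linear model, the function names, and all numbers are assumptions made for illustration; the post only says that regression models are trained.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b for a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def rae(pred, actual):
    """Relative absolute error: sum|y'_i - y_i| / sum|y_i - mean(y)|."""
    mean = sum(actual) / len(actual)
    return (sum(abs(p - t) for p, t in zip(pred, actual))
            / sum(abs(t - mean) for t in actual))

def relative_rmse(pred, actual):
    """Relative RMSE in percent: sqrt(mean(((y'_i - y_i) / y_i)^2)) * 100."""
    n = len(actual)
    return math.sqrt(sum(((p - t) / t) ** 2 for p, t in zip(pred, actual)) / n) * 100.0

# Synthetic numbers for illustration only: an HLS-estimated LUT count (input)
# paired with a post-implementation LUT count (output).
train_hls = [100, 200, 300, 400]
train_impl = [80, 150, 240, 310]
test_hls = [250, 500]
test_impl = [200, 390]

a, b = fit_line(train_hls, train_impl)
pred = [a * x + b for x in test_hls]

print("RAE:", rae(pred, test_impl))
print("Relative RMSE (%):", relative_rmse(pred, test_impl))
```

Note that relative RMSE divides by each actual value, so it is undefined for samples whose ground truth is zero. RAE compares the model against a baseline that always predicts the mean, so values below 1 mean the model beats that baseline.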

The results are reported in terms of maximum clock frequency, throughput, and throughput-to-area ratio of the RTL code generated by the HLS tool.

Comments

  1. Apart from estimation of the mentioned parameters, where else can it be used?

    1. It can also be used for cross-platform performance prediction.
