This page records my notes about exploring MIGraphX, henceforth MGX.
Overview
When reading an ONNX (or other format) model, MGX parses it and represents it with its own abstractions.
MGX first finalizes all operations in a program (i.e., all of its modules and all instructions in those modules), then invokes the compute function defined in each operation to evaluate the model.
Parse and compile an ONNX model
Compiler
Pass
When specifying a target (CPU, GPU, or FPGA), MGX obtains all of its necessary passes via std::vector<pass> target::get_passes (e.g., from src/targets/gpu/target.cpp).
This function returns all registered and supported optimization passes based on environment variables and the device type.
Finalize
After applying the optimization passes, the program is finalized (determining which kernels will be used to realize the semantics of each operation).
Meanwhile, operator parameters inferred at compile time are also dumped into the saved .mxr model file, e.g., the MIOpen solution ID obtained from compile-time tuning.
Evaluate and run a compiled program
After creating a program, users can call the eval function to run the model.
Quick ref: common data structures
Type erasure
Many data structures in MGX follow the type-erasure design pattern to hide type information and expose a unified interface to the outside.
Usually, an xxx struct contains an xxx_impl unique pointer where the real data members are stored.
Common code snippets
template <class T, class F, F f> // NOLINT
using manage_ptr = std::unique_ptr<T, manage_deleter<F, f>>;

#define MIGRAPHX_MANAGE_PTR(T, F) \
    migraphx::manage_ptr<std::remove_pointer_t<T>, decltype(&F), &F>
The macro MIGRAPHX_MANAGE_PTR yields a std::unique_ptr type with its deleter set to the given cleanup function.
Values
Shapes and arguments
struct MIGRAPHX_EXPORT shape
{
    shape(type_t t, std::vector<std::size_t> l);
    shape(type_t t, std::vector<std::size_t> l, std::vector<std::size_t> s);

    // for dynamic tensor shapes
    struct MIGRAPHX_EXPORT dynamic_dimension
    {
        std::size_t min = 0;
        std::size_t max = 0;
        std::set<std::size_t> optimals{};
    };

    std::size_t elements() const;
    std::size_t bytes() const;
};
A shape is described by its dimension sizes and optional stride sizes.
In addition to fixed-shape tensors, MGX also supports shapes with dynamic dimensions (carrying min and max values alongside each dimension).
struct MIGRAPHX_EXPORT argument : raw_data<argument>
{
    template <class T>
    argument(shape s, T* d) : m_shape(std::move(s))
    {
        assign_buffer([d] { return reinterpret_cast<char*>(d); });
    }

    template <class T>
    argument(shape s, std::shared_ptr<T> d) : m_shape(std::move(s))
    {
        assign_buffer([d] { return reinterpret_cast<char*>(d.get()); });
    }

    private:
    void assign_buffer(std::function<char*()> d);

    struct data_t
    {
        std::function<char*()> get = nullptr;
        std::vector<data_t> sub = {};

        data_t share() const;
        static data_t from_args(const std::vector<argument>& args);
    };

    argument(const shape& s, const data_t& d);

    shape m_shape;
    data_t m_data{};
};
An argument is the structure that holds the actual data described by a shape.
Instructions and op
using instruction_ref = std::list<instruction>::iterator;

// members in an instruction
struct MIGRAPHX_EXPORT instruction
{
    operation op;
    shape result{};
    std::vector<instruction_ref> output;
    std::vector<instruction_ref> arguments;
    std::vector<module_ref> module_args;
    literal lit;
    bool normalized = false;
    std::size_t target_id = 0;
};
Modules
struct MIGRAPHX_EXPORT module
{
    std::unique_ptr<module_impl> impl;
};

struct module_impl
{
    // A list is used to keep references to an instruction stable
    std::list<instruction> instructions;
    std::unordered_set<instruction*> instruction_set;
    std::string name;
    uint32_t nparams = 0;
    bool bypass = false;
};
Programs
struct MIGRAPHX_EXPORT program
{
    std::unique_ptr<program_impl> impl;
};

struct program_impl
{
    // A map is used to keep references to modules of the program
    std::unordered_map<std::string, module> modules;
    std::vector<context> contexts;
    std::vector<target> targets;
};
Target and context
struct context
{
    context(std::size_t device_id = 0, std::size_t n = value_of(MIGRAPHX_NSTREAMS{}, 1))
        : current_device(std::make_shared<hip_device>(device_id, n)),
          begin_event(create_event()),
          finish_event(create_event())
    {
    }

    private:
    // TODO: Make this a vector to support multiple devices
    std::shared_ptr<hip_device> current_device;
    std::vector<shared<hip_event_ptr>> events;

    bool exhaustive_tune = false;
    bool measure_perf = false;

    // for event perf timing
    shared<hip_event_ptr> start_event = nullptr;
    shared<hip_event_ptr> stop_event = nullptr;
    // for stream synchronization
    shared<hip_event_ptr> begin_event = nullptr;
    shared<hip_event_ptr> finish_event = nullptr;

    problem_cache pc{};
};
A context is an abstraction for managing HIP device state (the GPU, its streams, and its events).
struct MIGRAPHX_GPU_EXPORT target
{
    std::string name() const;
    std::vector<pass> get_passes(migraphx::context& gctx, const compile_options& options) const;
    migraphx::context get_context() const;
    argument copy_to(const argument& arg) const;
    argument copy_from(const argument& arg) const;
    argument allocate(const shape& s) const;
};
A target, by contrast, defines the possible "actions" (at both compile time and execution time) to take on the device.
Ranges
template <class Iterator>
struct iterator_range
{
    Iterator start;
    Iterator last;

    Iterator begin() const { return start; }
    Iterator end() const { return last; }
};

template <class Iterator, MIGRAPHX_REQUIRES(not std::is_integral<Iterator>{})>
iterator_range<Iterator> range(Iterator start, Iterator last)
{
    return {start, last};
}

inline iterator_range<iota_iterator> range(std::ptrdiff_t start, std::ptrdiff_t last)
{
    return {{start, {}}, {last, {}}};
}

inline iterator_range<iota_iterator> range(std::ptrdiff_t last) { return range(0, last); }
Use case: for (auto i : range(N)) {}, which iterates i from 0 to N - 1.
Useful links
- https://zhuanlan.zhihu.com/p/682962732
- https://blog.csdn.net/qianqing13579/article/details/124917730