# 'mesh' Dialect

The `mesh` dialect contains a set of attributes, operations and interfaces that
are useful for representing sharding and communication on a device mesh
cluster.

[TOC]

## Collective Communication Operations
The `mesh` dialect provides a number of operations that facilitate
communication between devices in a mesh.
It is assumed that the user is familiar with collective operations.
[Wikipedia](https://en.wikipedia.org/wiki/Collective_operation) has a good
explanation.
The main addition is that the collectives in this dialect have mesh
semantics.

### Device groups
The operation attributes `mesh` and `mesh_axes` specify a list of device mesh
axes that partition the devices into disjoint groups.
The collective operation is performed between devices in the same group.
Devices that have the same coordinates outside of axes `mesh_axes` are in the
same group.
A group is described by its multi-index along the axes outside of `mesh_axes`.
For example, if we have a device mesh of size `2x3x4x5` and the partition mesh
axes list is `[0, 1]`, then devices are partitioned into the groups
`{ { (i, j, k, m) | 0<=i<2, 0<=j<3 } | 0<=k<4, 0<=m<5 }`.
The device groups would be indexed by `{ (k, m) | 0<=k<4, 0<=m<5 }`.
Devices `(1, 0, 2, 3)` and `(1, 1, 2, 3)` will be in the same group.
Device `(1, 0, 2, 4)` will be in another group.
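
The grouping rule can be illustrated with a small Python sketch. This is not
part of the dialect; the helper name `device_groups` and its signature are
hypothetical and exist only to enumerate the groups of the `2x3x4x5` example.

```python
from itertools import product


def device_groups(mesh_shape, mesh_axes):
    """Hypothetical helper: map each group index to the devices in that group.

    A group index is the device's multi-index along the axes that are
    NOT listed in mesh_axes.
    """
    outer_axes = [a for a in range(len(mesh_shape)) if a not in mesh_axes]
    groups = {}
    # Enumerate every device coordinate in the mesh.
    for device in product(*(range(n) for n in mesh_shape)):
        group_index = tuple(device[a] for a in outer_axes)
        groups.setdefault(group_index, []).append(device)
    return groups


groups = device_groups((2, 3, 4, 5), [0, 1])
print(len(groups))          # 20 groups, one per (k, m)
print(len(groups[(2, 3)]))  # 6 devices, one per (i, j)
```

Under this sketch, devices `(1, 0, 2, 3)` and `(1, 1, 2, 3)` both land in the
group with index `(2, 3)`, while `(1, 0, 2, 4)` lands in group `(2, 4)`.
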
Some collective operations like all-to-all and all-gather care about the
order of devices.
The order of devices in a device group is induced by the order of axes in
`mesh_axes`.
The axes are ordered from outer to inner.
If we have an axis list `[3, 1]` then device `(i, 1, k, 0)` will precede
both devices `(i, 0, k, 1)` and `(i, 2, k, 0)`.

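As a rough illustration of the outer-to-inner ordering, the following Python
sketch lists the devices of one group in the induced order. The function
`ordered_group` is a hypothetical name introduced here, and the `2x3x4x5` mesh
shape is reused from the earlier example as an assumption.

```python
from itertools import product


def ordered_group(mesh_shape, mesh_axes, group_index):
    """Hypothetical helper: devices of one group in mesh_axes order.

    mesh_axes is read from outer to inner, so the last listed axis
    varies fastest. group_index holds the coordinates along the axes
    outside of mesh_axes, in increasing axis order.
    """
    outer_axes = [a for a in range(len(mesh_shape)) if a not in mesh_axes]
    devices = []
    # product() varies its last argument fastest, matching outer-to-inner.
    for in_group in product(*(range(mesh_shape[a]) for a in mesh_axes)):
        device = [0] * len(mesh_shape)
        for axis, coord in zip(outer_axes, group_index):
            device[axis] = coord
        for axis, coord in zip(mesh_axes, in_group):
            device[axis] = coord
        devices.append(tuple(device))
    return devices


order = ordered_group((2, 3, 4, 5), [3, 1], (0, 0))
print(order[:4])  # [(0, 0, 0, 0), (0, 1, 0, 0), (0, 2, 0, 0), (0, 0, 0, 1)]
```

With `i = 0` and `k = 0`, device `(0, 1, 0, 0)` appears before both
`(0, 2, 0, 0)` and `(0, 0, 0, 1)`, which agrees with the example above.
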
### In-group Device
Some operations like `broadcast`, `scatter` and `send` specify devices in each
device group.
These devices are represented with their multi-index over the mesh axes that
are not constant within a device group.
These are the axes specified by the `mesh_axes` attribute.

For example, on a 3D mesh an operation with `mesh_axes = [0, 2]` would specify
an in-group device with `(i, j)`. Then for each group with index `g` on the
second axis, the full mesh index of the in-group device would be `(i, g, j)`.

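The mapping from an in-group index to a full mesh index can be sketched in
Python as follows. The helper `full_device_index` is hypothetical and only
mirrors the 3D example above; it is not an API of the dialect.

```python
def full_device_index(mesh_rank, mesh_axes, in_group_index, group_index):
    """Hypothetical helper: combine in-group and group indices.

    in_group_index gives the coordinates along mesh_axes, in the order
    the axes are listed. group_index gives the coordinates along the
    remaining axes, in increasing axis order.
    """
    outer_axes = [a for a in range(mesh_rank) if a not in mesh_axes]
    device = [None] * mesh_rank
    for axis, coord in zip(mesh_axes, in_group_index):
        device[axis] = coord
    for axis, coord in zip(outer_axes, group_index):
        device[axis] = coord
    return tuple(device)


# In-group device (i, j) = (1, 4) in the group with index g = 2.
print(full_device_index(3, [0, 2], (1, 4), (2,)))  # (1, 2, 4)
```
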
## Operations

[include "Dialects/MeshOps.md"]

## Attributes

[include "Dialects/MeshAttributes.md"]