Commit efc969d

mbs-octoml authored and yangulei committed
Switch PlanDevices pass to be w.r.t. SEScopes instead of DLDeviceTypes. (apache#9326)
* Switch PlanDevices pass to be w.r.t. SEScopes instead of DLDeviceTypes.

  CAUTION: Breaking VM executable serialization change. I needed a new 'virtual devices' array in the executable so that instructions can continue to refer to devices by a simple index, yet the VM can respect both the device type and id for runtime devices.

  Continuing from apache#9313, and as part of apache/tvm-rfcs#38, we switch PlanDevices to plan with respect to SEScopes instead of just DLDeviceTypes. Our ultimate goal is to be able to flow memory scopes between PrimFuncs by re-running PlanDevices after the LowerTE pass. This PR at least gets us to being able to flow the memory scopes, but the actual changes to PlanDevices to look inside PrimFuncs are still two PRs in the future.

  However, we get two nice side effects right away:
  - Since SEScopes contain Targets we can isolate all the device-to-target resolution machinery within PlanDevices (with the help of CompilationConfig). After PlanDevices has run we can retrieve the Target for any sub-expression directly from that sub-expression's SEScope. For now we retain the one-Target-per-DLDeviceType constraint since it is baked into the public 'TargetMap' API, but the path to breaking that constraint is clearer.
  - Device ids are now respected all the way from annotation to executor. Previously, although we had a bit of plumbing using Devices, the device_id therein was ignored or defaulted to zero.

  The Python "on_device" annotation helpers still work w.r.t. devices. Thus, though they now respect device ids, they do not allow the user to specify a Target or memory scope as supported by the underlying SEScope.

* [checkpoint] Revert emitter.py, must have run 'black .' by mistake.

* [checkpoint] Address PR comments

  Also add back SplitArgs pass in build_module.cc which somehow got lost in the shuffle.

  (try again -- flaky test_crt.py test_autotune?)

* [checkpoint] Fix after rebase on CallLowered.
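
The first side effect described in the message (device-to-target resolution happens once during planning, after which the Target is read straight off the scope) can be illustrated with a small stand-alone sketch. It does not use TVM's headers. `ScopeSketch`, `ResolveTarget` and `TargetOf` are hypothetical names invented for this illustration, Targets are modelled as plain strings, and a negative number or empty string stands for "unconstrained"; none of this is the real SEScope API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an SEScope. A negative number or an empty string
// means "unconstrained"; the real class is not modelled here.
struct ScopeSketch {
  int device_type = -1;      // DLPack numbering, e.g. 1 for CPU, 2 for CUDA
  int device_id = -1;        // which device of that type
  std::string target;        // stand-in for a Target, e.g. "cuda"
  std::string memory_scope;  // e.g. "global"
};

// Planning-time step: fill in the target from a one-target-per-device-type
// map (a stand-in for the 'TargetMap' constraint the message mentions).
// Scopes without a device type, or with a target already set, are returned as-is.
ScopeSketch ResolveTarget(ScopeSketch scope, const std::map<int, std::string>& targets) {
  if (scope.device_type >= 0 && scope.target.empty()) {
    std::map<int, std::string>::const_iterator it = targets.find(scope.device_type);
    assert(it != targets.end() && "no target registered for this device type");
    scope.target = it->second;
  }
  return scope;
}

// After planning: later passes read the target directly from the scope and
// never consult the device-type map again.
const std::string& TargetOf(const ScopeSketch& scope) { return scope.target; }
```

Under these assumptions the device id travels with the scope unchanged, so a scope for "CUDA device 1" keeps its id while gaining a target, which is the second side effect the message lists.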
1 parent 42570e7 commit efc969d

57 files changed: +2432 -1951 lines changed

include/tvm/ir/function.h

Lines changed: 6 additions & 6 deletions
@@ -191,24 +191,24 @@ constexpr const char* kTarget = "target";
 constexpr const char* kGlobalSymbol = "global_symbol";

 /*!
- * \brief The device type which will hold each of the functions parameters.
+ * \brief The SEScope which will hold each of the functions parameters.
  *
  * Only supported on Relay \p Functions. Generally added by the \p PlanDevices pass, but
  * may be included as an annotation on user programs.
  *
- * Type: Array<Integer> (but interpreted as Array<DLDeviceType>)
+ * Type: Array<SEScope>
  */
-constexpr const char* kParamDeviceTypes = "param_device_types";
+constexpr const char* kParamSEScopes = "param_se_scopes";

 /*!
- * \brief The device type which will hold the function result.
+ * \brief The SEScope which will hold the function result.
  *
  * Only supported on Relay \p Functions. Generally added by the \p PlanDevices pass, but
  * may be included as an annotation on user programs.
  *
- * Type: Integer (but interpreted as DLDeviceType)
+ * Type: SEScope
  */
-constexpr const char* kResultDeviceType = "result_device_type";
+constexpr const char* kResultSEScope = "result_se_scope";

 } // namespace attr
 } // namespace tvm

include/tvm/relay/attrs/annotation.h

Lines changed: 0 additions & 62 deletions
@@ -31,68 +31,6 @@
 namespace tvm {
 namespace relay {

-/*!
- * \brief Attributes for the "on_device" special operator.
- *
- * The Relay call (aka 'annotation'):
- * \code
- * on_device(sub_expr, device_type=2)
- * \endcode
- * constrains \p sub_expr to execute and store its result on a device with \p DLDeviceType \p 2
- * (i.e. a \p kDLCuda device). However the annotation itself may appear in an expression to be
- * executed and stored on a different device. If so the compiler will automatically insert a
- * "device_copy" call to mediate the transition between devices.
- *
- * E.g.: Assuming %x and %y reside on the GPU and %z on the CPU then:
- * \code
- * multiply(on_device(add(%x, %y), device_type=2), %z)
- * \endcode
- * indicates the \p add should execute on the GPU but the \p multiply should execute on the CPU.
- * The compiler will rewrite this to:
- * \code
- * multiply(device_copy(add(%x, %y), src_dev_type=2, dst_dev_type=1), %z)
- * \endcode
- *
- * The Relay call
- * \code
- * on_device(sub_expr, device_type=2, is_fixed=True)
- * \endcode
- * is similar to the above, however the annotation itself must appear in an expression on the
- * same device. The compiler will check the devices are consistent, and will not insert any
- * "device_copy" call. This form of annotation shouldn't be necessary in user programs. However
- * it is needed by the \p PlanDevices pass to fully specify the results of device planning so that
- * the pass is idempotent.
- *
- * E.g.: The following program is equivalent to the above:
- * \code
- * let %a = on_device(add(%x, %y), device_type=2, is_fixed=True)
- * multiply(device_copy(%a, src_dev_type=2, dst_dev_type=1), %z)
- * \endcode
- * The "on_device" annotation with \p is_fixed=True indicates unambiguously that \p %a is stored
- * on the GPU.
- */
-struct OnDeviceAttrs : public tvm::AttrsNode<OnDeviceAttrs> {
-  // TODO(mbs): Replace device types with TargetDevice.
-  /*! \brief Device type on which argument expression should be evaluated. */
-  int device_type = kInvalidDeviceType;
-  /*!
-   * \brief If true, the result device must also be \p device_type and device planning should
-   * not insert any "device_copy" calls to respect this annotation.
-   *
-   * This is used by the device planning pass itself when annotating the planned program.
-   */
-  bool is_fixed = false;
-
-  TVM_DECLARE_ATTRS(OnDeviceAttrs, "relay.attrs.OnDeviceAttrs") {
-    TVM_ATTR_FIELD(device_type)
-        .describe("The type of the virtual device which should hold the expression result.")
-        .set_default(0);
-    TVM_ATTR_FIELD(is_fixed)
-        .describe("If true, do not insert a \"device_copy\" call to respect this annotation.")
-        .set_default(false);
-  }
-};
-
 /*!
  * \brief Annotate an expression to be cast into specific data type.
  */

include/tvm/relay/attrs/device_copy.h

Lines changed: 7 additions & 9 deletions
@@ -25,6 +25,7 @@
 #define TVM_RELAY_ATTRS_DEVICE_COPY_H_

 #include <tvm/ir/attrs.h>
+#include <tvm/target/se_scope.h>

 #include <string>

@@ -35,17 +36,14 @@ namespace relay {
  * \brief Options for the device copy operators.
  */
 struct DeviceCopyAttrs : public tvm::AttrsNode<DeviceCopyAttrs> {
-  // TODO(mbs): Should be TargetDevice.
-  int dst_dev_type;
-  int src_dev_type;
+  SEScope src_se_scope = SEScope::FullyUnconstrained();
+  SEScope dst_se_scope = SEScope::FullyUnconstrained();

   TVM_DECLARE_ATTRS(DeviceCopyAttrs, "relay.attrs.DeviceCopyAttrs") {
-    TVM_ATTR_FIELD(src_dev_type)
-        .describe("The virtual device/context type where the op copies data from.")
-        .set_default(0);
-    TVM_ATTR_FIELD(dst_dev_type)
-        .describe("The virtual device/context type where the op copies data to.")
-        .set_default(0);
+    TVM_ATTR_FIELD(src_se_scope)
+        .describe("The (virtual) device and scope where the op copies data from.");
+    TVM_ATTR_FIELD(dst_se_scope)
+        .describe("The (virtual) device and scope where the op copies data to.");
   }
 };

include/tvm/relay/attrs/memory.h

Lines changed: 3 additions & 4 deletions
@@ -26,6 +26,7 @@

 #include <tvm/ir/attrs.h>
 #include <tvm/relay/expr.h>
+#include <tvm/target/se_scope.h>

 #include <string>
 #include <vector>
@@ -42,15 +43,13 @@ Expr ToTupleType(const Type& t, const std::vector<Expr>& exprs);
  */
 struct AllocStorageAttrs : public tvm::AttrsNode<AllocStorageAttrs> {
   DataType dtype;
-  int device_id;
-  int device_type;
+  SEScope se_scope = SEScope::FullyUnconstrained();

   TVM_DECLARE_ATTRS(AllocStorageAttrs, "relay.attrs.AllocStorageAttrs") {
     TVM_ATTR_FIELD(dtype)
         .describe("The dtype of the tensor to allocate.")
         .set_default(DataType::Float(32, 1));
-    TVM_ATTR_FIELD(device_id).describe("The device id on which to allocate memory.");
-    TVM_ATTR_FIELD(device_type).describe("The device type on which to allocate memory.");
+    TVM_ATTR_FIELD(se_scope).describe("The SEScope on which to allocate memory.");
   }
 };

include/tvm/relay/attrs/on_device.h

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+ * \file tvm/relay/attrs/on_device.h
+ * \brief Attribute for the on device annotation.
+ */
+#ifndef TVM_RELAY_ATTRS_ON_DEVICE_H_
+#define TVM_RELAY_ATTRS_ON_DEVICE_H_
+
+#include <tvm/ir/attrs.h>
+#include <tvm/target/se_scope.h>
+
+#include <string>
+
+namespace tvm {
+namespace relay {
+
+/*!
+ * \brief Attributes for the "on_device" special operator.
+ *
+ * The Relay call (aka 'annotation'):
+ * \code
+ * on_device(sub_expr, se_scope=S)
+ * \endcode
+ * constrains \p sub_expr to execute and store its result on the \p SEScope \p S.
+ * However the annotation itself may appear in an expression to be executed and stored on a
+ * different \p SEScope. If so the compiler will automatically insert a "device_copy" call to
+ * mediate the transition between \p SEScopes.
+ *
+ * E.g.: Assuming %x and %y reside on the GPU and %z on the CPU then:
+ * \code
+ * multiply(on_device(add(%x, %y), se_scope=GPU), %z)
+ * \endcode
+ * indicates the \p add should execute on the GPU but the \p multiply should execute on the CPU.
+ * The compiler will rewrite this to:
+ * \code
+ * multiply(device_copy(add(%x, %y), src_se_scope=GPU, dst_se_scope=CPU), %z)
+ * \endcode
+ *
+ * The Relay call
+ * \code
+ * on_device(sub_expr, se_scope=S, is_fixed=True)
+ * \endcode
+ * is similar to the above, however the annotation itself must appear in an expression on the
+ * same \p SEScope \p S. The compiler will check the \p SEScopes are consistent, and will not
+ * insert any "device_copy" call. This form of annotation shouldn't be necessary in user programs.
+ * However it is needed by the \p PlanDevices pass to fully specify the results of device planning
+ * so that the pass is idempotent.
+ *
+ * E.g.: The following program is equivalent to the above:
+ * \code
+ * let %a = on_device(add(%x, %y), se_scope=GPU, is_fixed=True)
+ * multiply(device_copy(%a, src_se_scope=GPU, dst_se_scope=CPU), %z)
+ * \endcode
+ * The "on_device" annotation with \p is_fixed=True indicates unambiguously that \p %a is stored
+ * on the GPU.
+ */
+struct OnDeviceAttrs : public tvm::AttrsNode<OnDeviceAttrs> {
+  /*!
+   * \brief (Virtual) \p SEScope on which the result of the argument expression should be stored.
+   */
+  SEScope se_scope = SEScope::FullyUnconstrained();
+  /*!
+   * \brief If true, the result \p SEScope must also be \p se_scope, and device planning should
+   * not insert any "device_copy" calls to respect this annotation.
+   *
+   * This is used by the device planning pass itself when annotating the planned program.
+   */
+  bool is_fixed = false;
+
+  TVM_DECLARE_ATTRS(OnDeviceAttrs, "relay.attrs.OnDeviceAttrs") {
+    TVM_ATTR_FIELD(se_scope)
+        .describe("The (virtual) device and scope holding the expression result.")
+        .set_default(SEScope::FullyUnconstrained());
+    TVM_ATTR_FIELD(is_fixed)
+        .describe("If true, do not insert a \"device_copy\" call to respect this annotation.")
+        .set_default(false);
+  }
+};
+
+} // namespace relay
+} // namespace tvm
+
+#endif // TVM_RELAY_ATTRS_ON_DEVICE_H_
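
The annotation rules documented in on_device.h above (insert a "device_copy" when the annotated scope differs from the enclosing one, and only check consistency when `is_fixed` is true) can be sketched as a stand-alone function. This is a simplification for illustration, not the PlanDevices implementation: `RewriteOnDevice` is a hypothetical helper, expressions and scopes are plain strings, and a mismatch under `is_fixed` is modelled as an exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: rewrites on_device(sub_expr, se_scope=..., is_fixed=...)
// as seen from an enclosing expression that lives on context_scope.
std::string RewriteOnDevice(const std::string& sub_expr, const std::string& se_scope,
                            const std::string& context_scope, bool is_fixed) {
  assert(!sub_expr.empty());
  if (se_scope == context_scope) {
    // Same scope on both sides: nothing to mediate.
    return sub_expr;
  }
  if (is_fixed) {
    // A fixed annotation must already sit in its own scope, so no copy is
    // inserted and the inconsistency is reported instead.
    throw std::invalid_argument("on_device(is_fixed=True) for scope " + se_scope +
                                " used in scope " + context_scope);
  }
  // Scopes differ and the annotation is not fixed: mediate with a copy.
  return "device_copy(" + sub_expr + ", src_se_scope=" + se_scope +
         ", dst_se_scope=" + context_scope + ")";
}
```

Called with `add(%x, %y)`, scope `GPU`, context `CPU` and `is_fixed=false`, the sketch produces the same `device_copy(...)` text as the rewritten example in the header comment.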

include/tvm/relay/transform.h

Lines changed: 15 additions & 9 deletions
@@ -30,6 +30,8 @@
 #include <tvm/relay/function.h>
 #include <tvm/relay/op.h>
 #include <tvm/relay/op_attr_types.h>
+#include <tvm/target/compilation_config.h>
+#include <tvm/target/se_scope.h>
 #include <tvm/target/target.h>

 #include <string>
@@ -437,23 +439,27 @@ TVM_DLL Pass RelayToTIRTargetHook();
  * \brief A pass for manifesting explicit memory allocations and rewriting
  * specific dialects.
  *
- * \param target_host The target used by the host for compilation.
- * \param targets The device type and target pairs for compilation.
+ * \param cpu_se_scope SEScope for computations and data which must reside on a CPU, such as
+ * shapes and shape functions.
  *
  * \return The pass.
  */
-TVM_DLL Pass ManifestAlloc(Target target_host, Map<tvm::Integer, tvm::Target> targets);
+TVM_DLL Pass ManifestAlloc(SEScope cpu_se_scope);

 /*!
- * \brief Uses existing "on_device" and "device_copy" CallNodes to infer the device on which
- * every Relay sub-expression should run (and the result stored). Captures the result of that
- * analysis using new "on_device" and "device_copy" CallNodes. See
- * tvm::relay::transform::{LexicalOnDeviceMixin,DeviceAwareExprVisitor,DeviceAwareExprMutator}
+ * \brief Uses existing "on_device" and "device_copy" CallNodes to infer the \p SEScope on which
+ * every Relay sub-expression should run and the result stored. Captures the result of that
+ * analysis using new "on_device" and "device_copy" CallNodes.
+ *
+ * See tvm::relay::transform::{LexicalOnDeviceMixin,DeviceAwareExprVisitor,DeviceAwareExprMutator}
  * for help recovering the device for an arbitrary sub-expression in downstream transformations.
  *
- * \param default_device_type DLDeviceType for default device.
+ * \param config Describes the targets and default \p SEScope for all primitive operators and
+ * host sub-expressions.
+ *
+ * \return The pass.
  */
-TVM_DLL Pass PlanDevices(DLDeviceType default_device_type);
+TVM_DLL Pass PlanDevices(CompilationConfig config);

 } // namespace transform

include/tvm/runtime/vm/bytecode.h

Lines changed: 14 additions & 12 deletions
@@ -176,6 +176,7 @@ struct Instruction {
       RegName object;
     } get_tag;
     struct /* AllocADT Operands */ {
+      // TODO(mbs): Needs a DeviceAndScope.
       /*! \brief The datatype's constructor tag. */
       Index constructor_tag;
       /*! \brief The number of fields to store in the datatype. */
@@ -184,6 +185,7 @@ struct Instruction {
       RegName* datatype_fields;
     };
     struct /* AllocClosure Operands */ {
+      // TODO(mbs): Needs a DeviceAndScope.
       /*! \brief The index into the function table. */
       Index clo_index;
       /*! \brief The number of free variables to capture. */
@@ -198,8 +200,8 @@ struct Instruction {
       Index alignment;
       /*! \brief The hint of the dtype. */
       DLDataType dtype_hint;
-      /*! \brief The device type of the allocation. */
-      Index device_type;
+      /*! \brief The index of the device on which the allocation will be made. */
+      Index device_index;
     } alloc_storage;
     struct /* ShapeOf Operands */ {
       RegName tensor;
@@ -210,11 +212,11 @@ struct Instruction {
     } reshape_tensor;
     struct /* DeviceCopy Operands */ {
       RegName src;
-      /*! \brief The source device type. */
-      Index src_device_type;
-      /*! \brief The destination device type. */
-      Index dst_device_type;
-    };
+      /*! \brief The index of the source device to copy from. */
+      Index src_device_index;
+      /*! \brief The index of the destination device to copy to. */
+      Index dst_device_index;
+    } device_copy;
   };

   /*!
@@ -352,12 +354,12 @@ struct Instruction {
    * \param size The size of the allocation.
    * \param alignment The allocation's alignment.
    * \param dtype_hint The data type hint for the allocator.
-   * \param device_type The device type for the allocator.
+   * \param device_index The index of the device to allocate on.
    * \param dst The destination to place the storage.
    * \return The alloc storage instruction.
    */
   static Instruction AllocStorage(RegName size, Index alignment, DLDataType dtype_hint,
-                                  Index device_type, RegName dst);
+                                  Index device_index, RegName dst);
   /*!
    * \brief Get the shape of an input tensor.
    * \param tensor The input tensor.
@@ -376,12 +378,12 @@ struct Instruction {
   /*!
    * \brief Copy tensor cross different devices.
    * \param src The source register.
-   * \param src_device_type The device type of the tensor for the source register.
-   * \param dst_device_type The device type of the tensor ofr the destination register.
+   * \param src_device_index The index of the device holding the tensor in the source register.
+   * \param dst_device_index The index of the device to hold the tensor in the destination register.
    * \param dst The destination register to store the copied tensor.
    * \return The device copy instruction.
    */
-  static Instruction DeviceCopy(RegName src, Index src_device_type, Index dst_device_type,
+  static Instruction DeviceCopy(RegName src, Index src_device_index, Index dst_device_index,
                                 RegName dst);

   Instruction();
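
The switch from `device_type` to `device_index` operands in the hunks above relies on the executable carrying a table of virtual devices, so that an instruction stores only a small index while the VM can still recover both the device type and the device id. A minimal stand-alone sketch of such a table follows. `DeviceSketch` and `DeviceTableSketch` are hypothetical names, and the linear search and append-on-first-use policy are assumptions made for illustration, not the VM's actual serialization format.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical (device type, device id) pair, standing in for a runtime device.
struct DeviceSketch {
  int device_type;
  int device_id;
};

// Hypothetical 'virtual devices' table: instructions store only an index into it.
class DeviceTableSketch {
 public:
  // Return the index for (device_type, device_id), appending a new entry the
  // first time a given pair is seen.
  std::size_t IndexFor(int device_type, int device_id) {
    for (std::size_t i = 0; i < devices_.size(); ++i) {
      if (devices_[i].device_type == device_type && devices_[i].device_id == device_id) {
        return i;
      }
    }
    DeviceSketch device = {device_type, device_id};
    devices_.push_back(device);
    return devices_.size() - 1;
  }

  // Recover the full device from an index stored in an instruction operand.
  const DeviceSketch& At(std::size_t index) const {
    assert(index < devices_.size());
    return devices_[index];
  }

 private:
  std::vector<DeviceSketch> devices_;
};
```

With a table like this, two devices of the same type but different ids receive distinct indexes, which a bare device-type operand could not express.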
