Classes
Class Buffer
-
class Buffer
Manages memory buffers on the host CPU and the Houmo device.
Key functions are as follows:
Memory allocation on host CPU and Houmo device.
Copying data from a specified memory address on host into the Houmo device buffer.
Copying data from the Houmo device buffer into a specified memory address on host.
Retrieves the pointer to the data stored in the buffer.
Retrieves the size of the allocated buffer.
Buffer
Clone
-
Buffer tcim::Buffer::Clone(bool auto_copy = true) const
Creates a copy of the current Buffer object.
- Parameters:
auto_copy -- [in] Determines if the data stored in the buffer is copied to the new Buffer object.
If set to true (default), a new buffer is allocated, and both the data stored in the buffer and its associated metadata (such as size and memory type) are copied to the new buffer.
If set to false, a new buffer is allocated, and only the metadata (such as size and memory type) is copied to the new buffer, but its memory is uninitialized and does not contain the original data.
- Returns:
Returns a new Buffer object as a copy of the current Buffer object.
CopyFromHost
-
Status tcim::Buffer::CopyFromHost(const void *src, size_t size, size_t offset = 0)
Copies data from host memory to the buffer on Houmo device.
- Parameters:
src -- [in] Pointer to the source memory address on the host.
size -- [in] The size of buffer memory to be copied in bytes.
offset -- [in] The offset within the buffer where the data will be copied. Default is 0.
- Returns:
Returns the status of the function call.
CopyTo
-
Status tcim::Buffer::CopyTo(tcim::Buffer &dst, size_t size = 0, size_t src_off = 0, size_t dst_off = 0) const
Copies data from the current buffer to the specified destination buffer on the same Houmo device.
- Parameters:
dst -- [out] The destination buffer on the Houmo device.
size -- [in] The number of bytes to copy. If set to 0, the smaller size between the source and destination buffers is used by default.
src_off -- [in] The offset in the current buffer where the copy starts. Default is 0.
dst_off -- [in] The offset in the destination buffer where the copied data is placed. Default is 0.
- Returns:
Returns the status of the function call.
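The default for the size parameter of CopyTo can be pictured with a small helper. This is only a sketch of the documented rule (size 0 selects the smaller of the two buffer sizes); ResolveCopySize is a hypothetical name and not part of the tcim API, and the reference does not say how src_off and dst_off interact with the default, so offsets are left out.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of the tcim API): resolves the number of
// bytes a copy would transfer under the documented rule for size == 0.
std::size_t ResolveCopySize(std::size_t requested,
                            std::size_t src_size,
                            std::size_t dst_size) {
  if (requested != 0) {
    return requested;  // An explicit size is used as given.
  }
  // size == 0: fall back to the smaller of the two buffer sizes.
  return std::min(src_size, dst_size);
}
```

For example, a 64-byte source and a 32-byte destination with size left at 0 would resolve to a 32-byte copy under this reading.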
CopyToHost
-
Status tcim::Buffer::CopyToHost(void *dst, size_t size, size_t offset = 0)
Copies data from the buffer on Houmo device to host memory.
- Parameters:
dst -- [out] Pointer to the destination memory address.
size -- [in] The size of buffer memory to be copied in bytes.
offset -- [in] The offset within the buffer to start copying data from. Default is 0.
- Returns:
Returns the status of the function call.
Data
Device
DeviceId
GetInitStatus
GetSubBuffer
-
Buffer tcim::Buffer::GetSubBuffer(size_t size, size_t offset = 0) const
Retrieves a sub-buffer from the current buffer.
This method returns a sub-buffer that shares the same memory as the original buffer. The sub-buffer's lifecycle is tied to the parent buffer, and it does not allocate additional memory.
- Parameters:
size -- The size of the sub-buffer to retrieve, in bytes. Must not exceed the remaining size of the buffer starting from the specified offset.
offset -- The starting position within the buffer from which the sub-buffer begins, in bytes. Defaults to 0 if not specified.
- Returns:
A sub-buffer representing the specified portion of the original buffer.
Note
The returned sub-buffer is a view of the original buffer and does not manage its own memory. Ensure the parent buffer remains valid for the duration of the sub-buffer's usage.
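The view semantics and the size constraint described for GetSubBuffer can be modelled on plain host memory. The sketch below uses hypothetical stand-ins (ByteView, MakeSubView) rather than tcim::Buffer; it only illustrates that a view shares the parent's storage and that size must fit in the bytes remaining after offset. How the real API reports an out-of-range request is not stated in this reference.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a sub-buffer: a non-owning view into memory
// owned by a parent. It is not the tcim::Buffer implementation.
struct ByteView {
  std::uint8_t* data;
  std::size_t size;
};

// Returns a view of `size` bytes starting at `offset`, or std::nullopt when
// the requested range would run past the end of the parent storage.
std::optional<ByteView> MakeSubView(std::vector<std::uint8_t>& parent,
                                    std::size_t size,
                                    std::size_t offset = 0) {
  if (offset > parent.size() || size > parent.size() - offset) {
    return std::nullopt;
  }
  return ByteView{parent.data() + offset, size};
}
```

Because the view does not own its memory, a write through it is visible in the parent, and the view dangles if the parent is destroyed or reallocated first.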
MemSet
-
Status tcim::Buffer::MemSet(int8_t value = 0, size_t size = 0, size_t offset = 0)
Initializes the specified region of the device buffer to a constant byte value.
- Parameters:
value -- [in] The byte value used to initialize the target device memory region. Defaults to 0.
size -- [in] The number of bytes to initialize. If set to 0, all bytes from offset to the end of the device buffer are initialized.
offset -- [in] Byte offset from the beginning of the device buffer. Defaults to 0.
- Returns:
Returns the status of the function call.
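The range rule for MemSet (size 0 means "from offset to the end of the buffer") can be sketched on a host-side vector. FillRegion is a hypothetical model, not the tcim implementation, and rejecting an out-of-range request with false is an assumption made for the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of the documented MemSet range rule; it does
// not touch device memory and is not the tcim implementation.
// Returns false when the requested range does not fit (an assumed behavior).
bool FillRegion(std::vector<std::int8_t>& buffer,
                std::int8_t value = 0,
                std::size_t size = 0,
                std::size_t offset = 0) {
  if (offset > buffer.size()) {
    return false;
  }
  const std::size_t remaining = buffer.size() - offset;
  // size == 0 means "everything from offset to the end of the buffer".
  const std::size_t count = (size == 0) ? remaining : size;
  if (count > remaining) {
    return false;
  }
  std::fill_n(buffer.begin() + static_cast<std::ptrdiff_t>(offset), count, value);
  return true;
}
```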
operator=
Size
CreateDeviceBuffer
-
static Buffer tcim::Buffer::CreateDeviceBuffer(size_t size, int device_id = 0, const std::string &backend_name = "", const std::string &mem_type = "")
Allocates a buffer on Houmo device with the given memory size and logical device ID.
- Parameters:
size -- [in] The size of buffer memory to allocate in bytes.
device_id -- [in] The logical device ID of the Houmo device on which the buffer memory is allocated. The device 0 is used by default. You can retrieve the logical device IDs via SMI tool. See "SMI Tool User Guide" for details.
backend_name -- [in] The backend name of the Houmo device. Only the default value can be used. You can retrieve the backend name via Module::GetBackendName.
mem_type -- [in] The special memory type of the Houmo device. Default is "".
- Returns:
Returns a new Buffer object with memory allocated on the specified device.
-
static Buffer tcim::Buffer::CreateDeviceBuffer(void *dev_ptr, size_t size, int device_id, const std::string &backend_name)
Creates a Buffer object that represents a memory region on the Houmo device. This function does not allocate new memory but uses the memory region specified by dev_ptr.
Note
You are responsible for:
Ensuring the validity of the memory at dev_ptr before calling this function.
Keeping dev_ptr valid until all associated Buffer objects are destroyed.
Releasing the memory at dev_ptr only after all associated Buffer objects are no longer in use.
- Parameters:
dev_ptr -- [in] Pointer to the starting memory address of a valid memory region on the Houmo device.
size -- [in] The size of the memory region in bytes for the Buffer object. This size must not exceed the allocated memory size at dev_ptr.
device_id -- [in] The logical device ID of the Houmo device on which the memory is located. The device 0 is used by default. You can retrieve the logical device IDs via SMI tool. See "SMI Tool User Guide" for details.
backend_name -- [in] The backend name of the Houmo device. Only the default value can be used. You can retrieve the backend name via Module::GetBackendName.
- Returns:
Returns a new Buffer object representing the specified memory region.
CreateHostBuffer
-
static Buffer tcim::Buffer::CreateHostBuffer(size_t size, void *ptr = nullptr)
Allocates a buffer on host CPU with the given memory size.
If a memory pointer is set in ptr, the existing memory region will be used for allocation.
- Parameters:
size -- [in] The size of buffer memory to be allocated in bytes.
ptr -- [in] A pointer to the starting memory address used for buffer allocation. Defaults to nullptr, which allocates new memory. If size is set to 0, memory will not be allocated.
- Returns:
Returns a new Buffer object with memory allocated on the host.
Class CompTensor
-
class CompTensor
Composite tensor container supporting split/merge operations with offset tracking.
Manages hierarchical tensor decomposition and reconstruction while maintaining spatial relationships between sub-regions.
CompTensor
-
tcim::CompTensor::CompTensor(Tensor &&tensor)
Construct from rvalue tensor (move semantics).
- Parameters:
tensor -- [in] Temporary tensor source. Transfers ownership of tensor data to composite container, the data format of input tensor must be TCIM::DataFmt::CompND.
Warning
Original tensor becomes invalid after this operation.
-
tcim::CompTensor::CompTensor(Tensor &tensor)
Construct from lvalue tensor (copy semantics).
- Parameters:
tensor -- [in] Temporary tensor source. Transfers ownership of tensor data to composite container, the data format of input tensor must be TCIM::DataFmt::CompND.
Warning
Original tensor becomes invalid after this operation.
AsTensor
-
Tensor &tcim::CompTensor::AsTensor()
Convert to ND tensor representation.
- Returns:
Reference to underlying tensor storage.
Warning
Modifications may desynchronize with sub-tensor offsets.
GetInitStatus
-
Status tcim::CompTensor::GetInitStatus() const
Query composite structure initialization status.
- Example
StatusCode::ALLOC_FAILURE
- Example
See error handling in CompTensorDemo.cpp
- Returns:
Status object containing:
Status::OK
Status::UNINITIALIZED
SubTensorOffsets
-
std::vector<std::vector<int64_t>> &tcim::CompTensor::SubTensorOffsets() const
Access mutable offset coordinates of sub-tensors.
- Returns:
Reference to 2D vector where:
First dimension indexes sub-tensors.
Second dimension contains [N-dimensional offset coordinates].
Warning
Modifications may invalidate spatial consistency.
SubTensors
-
std::vector<tcim::Tensor> &tcim::CompTensor::SubTensors() const
Access immutable sub-tensor collection.
- Returns:
Reference to vector containing:
Ordered sub-tensor sequence.
Shared metadata with parent tensor.
Note
Sub-tensors maintain spatial relationships defined by offsets.
MergeCompNd
-
static Tensor tcim::CompTensor::MergeCompNd(const std::vector<Tensor> &roi_tensor, const std::vector<std::vector<int64_t>> &roi_offsets)
Reconstruct composite tensor from sub-regions.
- Example
See tensor reconstruction in CompTensorDemo::MergeDemo().
- Parameters:
roi_tensor -- [in] Vector of region-of-interest sub-tensors.
roi_offsets -- [in] Corresponding N-dimensional offsets.
- Returns:
Merged tensor satisfying:
shape == sum(sub_tensor_shapes)
dtype == sub_tensors[0].dtype()
format == tcim::Format::CompND
Note
The sub-tensors refer to the original tensor storage.
Class DevManager
-
class DevManager
Describes the device set on which the target model runs. For multi-GPU models, initialization must be performed using DevManager.
The DevManager class provides various methods to manage and operate device collections, including creating device manager instances, retrieving device counts, and checking initialization status. It is mandatory for multi-GPU models to use this class for proper device management.
DevManager
-
tcim::DevManager::DevManager() = default
Default constructor.
-
tcim::DevManager::DevManager(const DevManager&) = default
Default copy constructor.
- Parameters:
other -- The DevManager instance to copy.
-
tcim::DevManager::DevManager(DevManager&&) = default
Default move constructor.
- Parameters:
other -- The DevManager instance to move.
~DevManager
-
virtual tcim::DevManager::~DevManager()
Destructor.
DevCount
-
int tcim::DevManager::DevCount() const
Retrieves the number of devices.
- Returns:
The number of devices managed by this instance.
GetInitStatus
-
Status tcim::DevManager::GetInitStatus() const
Retrieves the initialization status.
- Returns:
The initialization status, of type Status.
operator!=
-
bool tcim::DevManager::operator!=(const DevManager &other) const
Inequality comparison operator.
- Parameters:
other -- The DevManager instance to compare with.
- Returns:
true if the two DevManager instances are not equal, false otherwise.
operator=
-
DevManager &tcim::DevManager::operator=(const DevManager&) = default
Default copy assignment operator.
- Parameters:
other -- The DevManager instance to copy.
- Returns:
Reference to the current object.
-
DevManager &tcim::DevManager::operator=(DevManager&&) = default
Default move assignment operator.
- Parameters:
other -- The DevManager instance to move.
- Returns:
Reference to the current object.
operator==
-
bool tcim::DevManager::operator==(const DevManager &other) const
Equality comparison operator.
- Parameters:
other -- The DevManager instance to compare with.
- Returns:
true if the two DevManager instances are equal, false otherwise.
Verify
-
Status tcim::DevManager::Verify() const
Validates all devices managed by this DevManager.
- Returns:
Returns Status::OK if all devices are valid, otherwise returns an error status.
Create
-
static DevManager tcim::DevManager::Create(const std::vector<int> &dev_indx, const std::string &backend_name = "")
Creates a DevManager instance using a list of device indices.
- Parameters:
dev_indx -- A list of device indices.
backend_name -- The backend name, default is an empty string.
- Returns:
A new DevManager instance.
-
static DevManager tcim::DevManager::Create(const std::vector<std::pair<std::string, int>> &devices)
Creates a DevManager instance using a list of (backend name, device ID) pairs.
- Parameters:
devices -- A list of pairs, where the first element is the backend name and the second is the device ID.
- Returns:
A new DevManager instance.
-
static DevManager tcim::DevManager::Create(int dev_id = 0, const std::string &backend_name = "")
Creates a DevManager instance.
- Parameters:
dev_id -- The device ID, default is 0.
backend_name -- The backend name, default is an empty string.
- Returns:
A new DevManager instance.
Class LogHandler
-
class LogHandler
Defines the interfaces for handling log messages generated by the library.
Notes
You must inherit from this class and implement the OnLog method to redirect log output (e.g., to a file, console, or external logging system).
The OnLog method may be called from multiple internal threads concurrently. Therefore, the implementation must be thread-safe.
Avoid performing heavy blocking operations within OnLog to prevent impacting the performance of the main execution pipeline.
~LogHandler
-
virtual tcim::LogHandler::~LogHandler() = default
Destructor.
Flush
-
inline virtual void tcim::LogHandler::Flush()
Flushes any buffered log data.
This method is called when the logger needs to ensure all pending logs are written to the destination (e.g., during shutdown or critical errors).
OnLog
-
virtual void tcim::LogHandler::OnLog(std::string_view msg, int level) = 0
Callback function triggered when a log event occurs.
- Parameters:
msg -- [in] The formatted log message content.
level -- [in] The severity level of the log (corresponding to spdlog levels).
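A thread-safe OnLog implementation might look like the sketch below. To keep it compilable on its own, it declares a local LogHandler stand-in that mirrors the signatures documented above; in real code you would include the library header and derive from tcim::LogHandler instead. MemoryLogHandler and its Entries method are hypothetical names, and how a handler is registered with the library is not covered in this section.

```cpp
#include <mutex>
#include <string>
#include <string_view>
#include <vector>

// Stand-in mirroring the documented interface so the sketch compiles on its
// own; in real code, derive from tcim::LogHandler instead.
class LogHandler {
 public:
  virtual ~LogHandler() = default;
  virtual void Flush() {}
  virtual void OnLog(std::string_view msg, int level) = 0;
};

// Hypothetical handler that collects messages in memory under a mutex.
class MemoryLogHandler : public LogHandler {
 public:
  void OnLog(std::string_view msg, int level) override {
    // Keep the critical section short: format, append, and return.
    std::lock_guard<std::mutex> lock(mutex_);
    entries_.push_back("[" + std::to_string(level) + "] " + std::string(msg));
  }

  // Returns a snapshot so callers never read the vector while it is mutated.
  std::vector<std::string> Entries() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return entries_;
  }

 private:
  mutable std::mutex mutex_;
  std::vector<std::string> entries_;
};
```

The message is copied out of the std::string_view inside OnLog because the view is only assumed to be valid for the duration of the call.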
Class Module
-
class Module
Represents a module instance used for model inference at runtime. Creation of the weight manager is deferred until the first call to Module::LoadFromFile.
You can initialize modules using the following methods with the binary model file (.hmm or .hmms):
The static function: Module::LoadFromFile. For example:
auto module = Module::LoadFromFile("tcim_resnet50.hmm");
The constructor function: Module::Module. For example:
Module("tcim_resnet50.hmm");
The default constructor function. You can create an empty module and then load the binary model file with the Module::LoadModel method. For example:
Module module; module.LoadModel("tcim_resnet50.hmm");
Note
If a Module object is defined as a global variable, it must be explicitly destroyed before the main function exits.
Nested Class Option
-
class Option
Represents the configurations for initializing a Module object and loading the binary model file (.hmm or .hmms).
- Example
See example in WeightManager.
Option
-
explicit tcim::Module::Option::Option(int device_id = 0)
Constructs an Option object with the specified device ID.
- Parameters:
device_id -- [in] The logical ID of the Houmo device used for loading and inferring a model. The device 0 is used by default. You can retrieve the logical device IDs via SMI tool. See "SMI Tool User Guide" for details.
-
explicit tcim::Module::Option::Option(const DevManager&)
Constructs an Option object with the specified DevManager.
- Parameters:
dev_manager -- [in] The DevManager object that defines the device set on which the model will be loaded and run for inference.
-
explicit tcim::Module::Option::Option(Module::WeightManager&)
Constructs an Option object with the specified WeightManager.
- Parameters:
weight_manager -- [in] The WeightManager object to manage the shared weight memory.
EnableIOLazyMode
-
Option &tcim::Module::Option::EnableIOLazyMode(bool lazy = true)
Controls when input and output buffers used during inference are allocated on host.
When lazy is set to true, input and output buffers are allocated only when they are accessed for the first time. This avoids allocating buffers for unused execution paths, and reduces runtime memory usage. The first access to a buffer may incur allocation overhead.
- Parameters:
lazy -- [in] Specifies when buffers used to store input and output data are allocated. Set to true to allocate buffers on first access. Set to false to allocate all buffers during model initialization.
- Returns:
Reference to the Option object.
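The allocate-on-first-access behavior that EnableIOLazyMode describes follows a general pattern, sketched below on host memory. LazyHostBuffer is a hypothetical class written for illustration; this reference does not describe how the runtime actually manages its input and output buffers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration of allocate-on-first-access; the runtime's
// actual buffer management is not described in this reference.
class LazyHostBuffer {
 public:
  explicit LazyHostBuffer(std::size_t size) : size_(size) {}

  // Allocates the storage the first time it is requested, then reuses it.
  std::uint8_t* Data() {
    if (storage_.empty() && size_ != 0) {
      storage_.resize(size_);  // The first access pays the allocation cost.
    }
    return storage_.data();
  }

  bool IsAllocated() const { return !storage_.empty(); }

 private:
  std::size_t size_;
  std::vector<std::uint8_t> storage_;
};
```

A buffer that is never touched never allocates, which is the memory saving the option trades against the one-time cost on first access.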
EnableHostLazyLoading
-
Option &tcim::Module::Option::EnableHostLazyLoading(bool lazy = false)
Controls lazy buffer allocation strategy during model loading phase.
When lazy is set to true, buffer allocation and initialization on the host are deferred until they are actually needed. This strategy reduces the peak host memory footprint during model loading via Module::LoadFromFile and Module::LoadModel APIs. This is beneficial for memory-constrained environments. The trade-off is increased model loading latency due to on-demand buffer initialization.
When lazy is set to false, all buffers are pre-allocated and initialized during model loading, resulting in faster model load completion but higher memory usage.
- Parameters:
lazy -- [in] Specifies the buffer allocation strategy for model loading. Set to true to save host memory usage by deferring buffer allocation and initialization. Set to false (default) to speed up model loading.
- Returns:
Reference to the Option object.
SetDummyTensors
-
Option &tcim::Module::Option::SetDummyTensors(std::vector<std::string> &dummy_names)
Sets a list of tensor names of a model.
By default, memory for input and output tensors is allocated during model loading. If this function is called, the memory is not allocated for these tensors when loading a model.
Notes
This function is used in scenarios where multiple models are inferred sequentially, with the output of one model as the input of another model. It optimizes memory usage by avoiding unnecessary memory allocation.
The memory for input and output tensors must be allocated before model inference. You can call Module::GetInput and Module::SetInput to set the output of one model as input of another model.
- Example
// ...
// Create a weight manager
auto weight_manager = tcim::Module::WeightManager::CreateWeightManager(0);
// Create Option objects to set configurations for models
tcim::Module::Option option1(weight_manager);
tcim::Module::Option option2(weight_manager);
// Set dummy tensor names
std::vector<std::string> dummy_tensor_names = {"model_layers"};
option2.SetDummyTensors(dummy_tensor_names);
// Load Qwen models
auto prefill_part1_model = tcim::Module::LoadFromFile("qwen2_prefill_part1.hmm", option1);
auto decode_part1_model = tcim::Module::LoadFromFile("qwen2_decode_part1.hmm", option2);
// Get the output of prefill_part1_model
auto kcache = prefill_part1_model.GetInput("model_layers");
// Set the output of prefill_part1_model as input of decode_part1_model
decode_part1_model.SetInput("model_layers", kcache);
- Parameters:
dummy_names -- [in] A vector of tensor names to be set.
- Returns:
A reference to the current Option object.
SetModelOffset
-
Option &tcim::Module::Option::SetModelOffset(size_t offset, size_t size)
Sets the model offset within the file or buffer.
This function allows model data to be stored at a non-zero offset within a file. It enables loading models that are embedded within larger files or combined with other data structures, such as:
Models packaged with metadata or other assets.
Multiple models concatenated in a single container file.
Models with prepended headers or format identifiers.
- Example usage:
// Load a model that starts at 1024 bytes into the file, with total size of 50MB
SetModelOffset(1024, 50 * 1024 * 1024);
// Load from the beginning of the file (default behavior)
SetModelOffset(0, file_size);
- Parameters:
offset -- [in] The starting position (in bytes) of the model data within the file.
size -- [in] The size (in bytes) of the model data to be loaded from the specified offset.
- Returns:
Reference to the Option object for method chaining.
Note
Important considerations:
The offset must be properly aligned according to the model format requirements.
The size parameter must match the exact size of the model data to be loaded.
Ensure that the file is large enough to contain [offset, offset + size) bytes.
This setting may affect file I/O performance depending on the storage medium.
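The requirement that the file contains the byte range [offset, offset + size) can be checked before SetModelOffset is called. The helper below is a minimal sketch; ModelRangeFits is a hypothetical name, not part of the tcim API, and it checks only the range, not the format-specific alignment mentioned above.

```cpp
#include <cstddef>

// Hypothetical pre-flight check (not part of the tcim API): true when the
// half-open byte range [offset, offset + size) lies inside a file of
// file_size bytes. Written to avoid overflow in offset + size.
bool ModelRangeFits(std::size_t file_size, std::size_t offset, std::size_t size) {
  return offset <= file_size && size <= file_size - offset;
}
```

Comparing size against file_size - offset, rather than computing offset + size, keeps the check correct even when the sum would wrap around.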
EnableProfile
SetInOutDftMemType
Nested Class RunOption
-
class RunOption
Represents dynamic options used when initializing or loading binary model files. Internal use only.
RunOption
Rounds
CoreMask
-
RunOption &tcim::Module::RunOption::CoreMask(const int64_t v)
Sets the core_mask option. Internal use only.
The core_mask uses binary format where bit i represents core i:
core0 = 0x1, core1 = 0x2, core2 = 0x4, core3 = 0x8
core0+core1 = 0x3, all four cores = 0xF
0 means no mask (use default configuration).
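The bit layout above can be reproduced with a few shifts. MakeCoreMask is a hypothetical helper for illustration, not part of the tcim API; it only shows how the listed values (0x1, 0x3, 0xF) arise from setting bit i for core i.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical helper (not part of the tcim API): builds a core mask by
// setting bit i for every core index i in the list.
std::int64_t MakeCoreMask(std::initializer_list<int> cores) {
  std::int64_t mask = 0;
  for (int core : cores) {
    assert(core >= 0 && core < 63);  // Stay within the non-sign bits.
    mask |= (std::int64_t{1} << core);
  }
  return mask;
}
```

An empty list yields 0, which matches the "no mask" value described above.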
GetRounds
GetCoreMask
Nested Class WeightManager
-
class WeightManager
Represents weight manager reference for Module.
For large language models, like Qwen, if each weight within the models is allocated a block of device memory to store the weight values, it may lead to substantial memory consumption. To save device memory, modules can share their weight memory with a weight manager when performing multi-tasks within a module, or across multi-modules with the same weight values on the same Houmo device.
You can create this object by calling tcim::Module::WeightManager::CreateWeightManager, and load modules with the same weight manager to share the weight memory by calling tcim::Module::LoadModel.
Notes
When using tcim::Module::WeightManager::CreateWeightManager to create a weight manager object, the actual weight manager is not instantiated immediately. This object is initialized only when tcim::Module::LoadModel or tcim::Module::LoadFromFile is called with the provided Option object, which contains the WeightManager. This deferred instantiation allows for more efficient resource allocation during the model loading process.
- Example
The following example shows how to create two modules and a weight manager, and then load models with the same weight manager.
// Create Module objects: module_wm_1 and module_wm_2
auto module_wm_1 = std::make_shared<tcim::Module>();
auto module_wm_2 = std::make_shared<tcim::Module>();
{
    // Create a weight manager
    auto weight_manager = tcim::Module::WeightManager::CreateWeightManager(0);
    // Create an Option object to set configurations for loading models
    tcim::Module::Option option(weight_manager);
    // Load models with the weight manager
    module_wm_1->LoadModel("tcim_resnet50.hmm", option);
    module_wm_2->LoadModel("tcim_resnet50.hmm", option);
    // After loading the models, the option and weight manager can be released.
}
module_wm_1.reset(); // The weights will be released here
module_wm_2.reset();
CreateWeightManager
-
static WeightManager tcim::Module::WeightManager::CreateWeightManager(int device_id = 0)
Creates a weight manager for sharing weight memory.
- Example
See example in WeightManager.
- Parameters:
device_id -- [in] (Optional) The logical ID of the Houmo device on which the shared weights are stored and the model is inferred. The device 0 is used by default. You can retrieve the logical device IDs via SMI tool. See "SMI Tool User Guide" for details.
- Returns:
Returns a WeightManager object for managing shared weight memory.
-
static WeightManager tcim::Module::WeightManager::CreateWeightManager(const DevManager &dev_manager)
Creates a weight manager for sharing weight memory.
- Example
See example in WeightManager.
- Parameters:
dev_manager -- [in] The devices that need to share weight memory. It is usually used with multi-device models (.hmms).
- Returns:
Returns a WeightManager object for managing shared weight memory.
GetInitStatus
-
Status tcim::Module::WeightManager::GetInitStatus() const
Retrieves the status of the constructor function and related resources.
- Returns:
Returns the status of the constructor function.
Module
-
tcim::Module::Module()
Default constructor.
-
tcim::Module::Module(const Module &module) = default
Default copy constructor.
- Parameters:
module -- [in] The Module object to be copied.
-
tcim::Module::Module(const std::string &filename, const Option& = Option())
Constructs a module with the given binary model file (.hmm or .hmms) and configuration options.
- Parameters:
filename -- [in] The file name and path of the binary model file.
options -- [in] The configuration options for module initialization, defined in Module::Option.
-
tcim::Module::Module(const void *data, uint64_t len, const Option& = Option())
Constructs a module from host memory with the given binary model file (.hmm or .hmms), size, and configuration options. The binary model file will be copied to Houmo device memory automatically for inference.
This API is used to load a confidential binary model file, ensuring its security and preventing unauthorized access. The model data should be decrypted before being passed to this API.
- Parameters:
data -- [in] Pointer to the memory containing the binary model file.
len -- [in] The size of the memory in bytes.
options -- [in] The configuration options for module initialization, defined in Module::Option.
~Module
DumpInOut
-
Status tcim::Module::DumpInOut(const std::string &path)
Dumps all current input and output tensors of the module to the specified directory.
Saves every input and output tensor as a .npy file (preserving the original tensor shape) and generates a model.json file in the target directory that is directly compatible with the TCIM tester (tests/tester/tester.py).
Output directory layout
<path>/
  model.json -- tester-compatible JSON (Golden + Model sections)
  input_<name>.npy -- one file per input tensor
  output_<name>.npy -- one file per output tensor
Notes
The directory at path is created automatically if it does not exist.
All input tensors are transferred from device memory to contiguous host memory before being written to disk. Output tensors are already in contiguous host memory after Module::Run + Module::Sync.
The generated model.json can be passed directly to the tester via python tester.py model.hmm --json <path>/model.json.
- Parameters:
path -- [in] Path to the output directory.
- Returns:
Returns the status of the function call.
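The output layout of DumpInOut can be turned into a list of expected file paths, for instance to check a dump directory after the call. ExpectedDumpFiles is a hypothetical helper, not part of the tcim API, and it assumes tensor names appear verbatim in the file names, which this reference does not state.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the tcim API): lists the files the
// documented layout implies for the given tensor names. Assumes tensor
// names are used verbatim in file names.
std::vector<std::string> ExpectedDumpFiles(
    const std::string& dir,
    const std::vector<std::string>& input_names,
    const std::vector<std::string>& output_names) {
  std::vector<std::string> files;
  files.push_back(dir + "/model.json");
  for (const auto& name : input_names) {
    files.push_back(dir + "/input_" + name + ".npy");
  }
  for (const auto& name : output_names) {
    files.push_back(dir + "/output_" + name + ".npy");
  }
  return files;
}
```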
GetBackendName
GetCoreNum
-
int64_t tcim::Module::GetCoreNum() const
Retrieves the number of IPU cores of the Houmo device used for model inference.
- Returns:
Returns the number of IPU cores used for model inferences.
GetCustomMsg
-
std::string tcim::Module::GetCustomMsg() const
Retrieves the custom information embedded in the model at compilation time via the custom_msg parameter.
- Returns:
Returns the custom information string.
GetDevInput
-
Tensor tcim::Module::GetDevInput(const std::string &name)
Gets input tensor on Houmo device with the given tensor name.
If the tensor name is set via Module::Option::SetDummyTensors, this function returns UNINITIALIZED.
- Parameters:
name -- [in] The name of the input tensor to query for.
- Returns:
Returns the input tensor.
Warning
This function does not automatically copy tensor data to contiguous host CPU memory. You need to call Tensor::ToHost and Tensor::CastTo to explicitly copy and convert tensor data to host CPU memory if needed.
GetDevOutput
-
Tensor tcim::Module::GetDevOutput(const std::string &name)
Gets the original output data from pre-allocated memory with the given tensor name.
- Parameters:
name -- [in] The name of the output tensor to query for.
- Returns:
Returns the output tensor.
Warning
This function does not automatically copy tensor data to contiguous host CPU memory. You need to call Tensor::ToHost and Tensor::CastTo to explicitly copy and convert tensor data to host CPU memory if needed.
GetInitStatus
GetInput
-
Tensor tcim::Module::GetInput(const std::string &name)
Gets input tensor with the given tensor name.
If the tensor name is set via Module::Option::SetDummyTensors, this function returns UNINITIALIZED.
- Parameters:
name -- [in] The name of the input tensor to query for.
- Returns:
Returns the input tensor.
Warning
This function is deprecated and will be removed in a future release. Use Module::GetDevInput instead.
-
Status tcim::Module::GetInput(const std::string &name, Tensor &tensor)
Gets input data from pre-allocated memory with the given tensor name. The input data includes input tensors defined by the Tensor class.
Notes
If the original tensor data in host memory is not stored in a contiguous layout, the tensor data will be copied automatically to contiguous host CPU memory.
Before calling this function, you must pre-allocate the tensor parameter by creating a Tensor object. You can call tcim::Tensor::CreateDeviceTensor or tcim::Tensor::CreateHostTensor to create a tensor on Houmo device or host.
The pre-allocated tensor must match the model's input tensor in format, data type, and shape, as retrieved via Module::GetInputInfo. This ensures that the input tensor data can be correctly stored.
If the tensor name is set via Module::Option::SetDummyTensors, this function returns INVALID_ARGUMENT.
- Parameters:
name -- [in] The name of the input tensor to query for.
tensor -- [out] The input tensor.
- Returns:
Returns the status of the function call.
Warning
This function is deprecated and will be removed in a future release. Use Module::GetDevInput instead.
GetInputInfo
-
TensorInfo tcim::Module::GetInputInfo(const std::string &name) const
Gets the tensor information, such as tensor shape, data type, and format with the given input tensor name.
- Parameters:
name -- [in] The name of the input tensor to query for.
- Returns:
Returns the tensor information of the tensor.
GetInputName
-
std::string tcim::Module::GetInputName(int index) const
Gets the name of the index-th input tensor.
- Parameters:
index -- [in] The position of the input tensor in the network model to query for.
- Returns:
Returns the name of the index-th input tensor.
GetInputNum
-
size_t tcim::Module::GetInputNum() const
Gets the total number of input tensors in the network model.
- Returns:
Returns the total number of input tensors.
GetMemSize
-
const std::vector<size_t> &tcim::Module::GetMemSize()
Retrieves the memory size of the model in bytes.
Currently, this function is not supported.
- Returns:
Returns the size of the model in bytes. The elements of the return array represent the memory size of weight, workspace, inputs and outputs respectively.
GetModelVersion
-
const std::string &tcim::Module::GetModelVersion()
Retrieves the date when the model is compiled.
- Returns:
Returns the date when the model was compiled.
GetOutput
-
Tensor tcim::Module::GetOutput(const std::string &name)
Gets output data in a contiguous memory layout from pre-allocated memory with the given tensor name, and copies the data to host CPU memory as a new tensor.
- Parameters:
name -- [in] The name of the output tensor to query for.
- Returns:
Returns the output tensor.
Note
If the original tensor data in host memory is not stored in a contiguous layout, the tensor data will be copied automatically to contiguous host CPU memory.
-
Tensor tcim::Module::GetOutput(const std::string &name, Device device)
Gets output data from pre-allocated memory with the given tensor name.
- Parameters:
name -- [in] The name of the output tensor to query for.
device -- [in] The device type on which the output data is stored.
- Returns:
Returns the output tensor.
Warning
This function is deprecated and will be removed in a future release. Use Module::GetDevOutput instead.
-
Status tcim::Module::GetOutput(const std::string &name, Tensor &tensor)
Gets output data from pre-allocated memory with the given tensor name. The output data includes output tensors defined by the Tensor class.
Notes
Before calling this function, you must pre-allocate the tensor parameter by creating a Tensor object. You can call tcim::Tensor::CreateDeviceTensor or tcim::Tensor::CreateHostTensor to create a tensor on Houmo device or host.
The pre-allocated tensor must match the model's output tensor in format, data type, and shape, as retrieved via Module::GetOutputInfo. This ensures that the output data can be correctly stored.
- Parameters:
name -- [in] The name of the output tensor.
tensor -- [out] The output Tensor object.
- Returns:
Returns the status of the function call.
Note
This function should be called after Module::Run and Module::Sync.
GetOutputInfo
-
TensorInfo tcim::Module::GetOutputInfo(const std::string &name) const
Gets the information about the output tensor of the model inference, such as tensor shape, data type, and format with the given output tensor name.
- Parameters:
name -- [in] The name of the output tensor.
- Returns:
Returns the information of the output tensor.
GetOutputName
-
std::string tcim::Module::GetOutputName(int index) const
Gets the name of the index-th output tensor.
- Parameters:
index -- [in] The position of the output tensor in the network model to query for.
- Returns:
Returns the name of the index-th output tensor.
GetOutputNum
-
size_t tcim::Module::GetOutputNum() const
Gets the total number of output tensors in the network model.
- Returns:
Returns the total number of output tensors.
GetWorkspaces
LoadDumpInput
-
Status tcim::Module::LoadDumpInput(const std::string &path)
Loads input tensors previously saved by DumpInOut and feeds them directly into this Module.
Reads the model.json produced by a prior DumpInOut call, resolves the .npy file path of each input tensor listed in the Golden.inputs section, loads the data into a temporary host Tensor, and forwards it to the Module via Module::SetInput.
Notes
The path must point to a directory produced by a prior DumpInOut call.
If the json contains an input name that is not recognised by this Module, a warning is logged and loading continues for the remaining inputs - the function still returns Status::OK.
Module must be initialised (LoadModel called) before invoking this function.
- Parameters:
path -- [in] Path to the directory that contains model.json and the associated .npy files.
- Returns:
Returns the status of the function call.
LoadModel
-
Status tcim::Module::LoadModel(const std::string &filename, const Option& = Option())
Loads a model from the given binary model file (.hmm or .hmms) with the given configuration options.
- Parameters:
filename -- [in] The file name and path of the binary model file.
Option -- The configuration options for loading the binary model file, defined in Module::Option.
- Returns:
Returns the status of the function call.
-
Status tcim::Module::LoadModel(const void *data, uint64_t len, const Option& = Option())
Loads a model from host memory with the given binary model data (.hmm or .hmms), size, and configuration options. The binary model data is copied to Houmo device memory automatically for inference.
This API is used to load a confidential binary model file, ensuring its security and preventing unauthorized access. The binary model data should be decrypted before being passed to this API.
- Parameters:
data -- [in] The data buffer of the binary model file.
len -- [in] The buffer length.
Option -- The configuration options for loading the binary model file, defined in Module::Option.
- Returns:
Returns the status of the function call.
operator bool
-
explicit tcim::Module::operator bool() const noexcept
Checks if the current Module object is initialized.
- Returns:
Returns true if the current Module object is initialized. Otherwise, returns false.
Note
You must initialize the modules first before any further processing on modules. See Module for detailed information on how to initialize modules.
operator!
-
bool tcim::Module::operator!() const noexcept
Checks if the current Module object is not initialized.
- Returns:
Returns true if the current Module object is not initialized. Otherwise, returns false.
Note
You must initialize the modules first before any further processing on modules. See Module for detailed information on how to initialize modules.
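The two operators above follow the common C++ validity-check idiom. The sketch below illustrates how such a pair behaves in conditions; it uses a hypothetical stand-in class, `FakeModule`, rather than the real tcim::Module, and the helper names `IsReady` and `RequireReady` are likewise assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in used only for illustration; this is NOT tcim::Module.
class FakeModule {
 public:
  FakeModule() = default;  // models an uninitialized object
  explicit FakeModule(std::string model_path)
      : model_path_(std::move(model_path)), initialized_(true) {}

  // Same signatures as the two operators documented above.
  explicit operator bool() const noexcept { return initialized_; }
  bool operator!() const noexcept { return !initialized_; }

 private:
  std::string model_path_;
  bool initialized_ = false;
};

// Guard to place before any further processing on the object.
inline bool IsReady(const FakeModule &m) {
  if (!m) {  // uses operator!
    return false;
  }
  // "explicit" means a cast is required outside of conditions.
  return static_cast<bool>(m);
}

// Debug-build check for code paths that require an initialized object.
inline void RequireReady(const FakeModule &m) {
  assert(IsReady(m) && "initialize the module before using it");
  (void)m;  // avoid an unused-parameter warning when NDEBUG is defined
}
```

Because the conversion is explicit, `if (m)` and `if (!m)` compile, but an accidental `bool b = m;` does not.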
operator!=
operator=
operator==
Run
-
Status tcim::Module::Run(bool sync = false, const RunOption& = RunOption())
Runs inference on the loaded model.
- Parameters:
sync -- [in] (Optional) Specifies whether to run the model synchronously. If set to true, the model runs synchronously. The default value is false.
RunOption -- The configuration options for model inference.
- Returns:
Returns the status of the function call.
SetDevInput
-
Status tcim::Module::SetDevInput(const std::string &name, const Tensor &tensor)
Sets the input data to the pre-allocated memory on Houmo device with the given tensor name. The input data includes input tensors defined with the Tensor class.
Notes
Before calling this function, you must pre-allocate the tensor parameter by creating a Tensor object. You can call tcim::Tensor::CreateDeviceTensor to create a tensor on Houmo device.
For multi-device models, the tensor must be on COMPND device.
The pre-allocated tensor must match all attributes of the corresponding model input tensor, with the same name specified in the name parameter, as retrieved via Module::GetInputInfo, and must be allocated on the same device type (Houmo device) and device ID as that input tensor.
- Parameters:
name -- [in] The name of the input tensor.
tensor -- [in] The input Tensor object.
- Returns:
Returns the status of the function call.
Warning
This function does not automatically copy tensor data to contiguous host CPU memory. You need to call Tensor::ToHost and Tensor::CastTo to explicitly copy and convert tensor data to host CPU memory if needed.
SetDevOutput
-
Status tcim::Module::SetDevOutput(const std::string &name, Tensor &tensor)
Sets the device output data to the pre-allocated memory on Houmo device with the given tensor name. This operation must be an in-place operation.
Note
Before calling this function, you must pre-allocate the tensor parameter by creating a Tensor object. You can call tcim::Tensor::CreateDeviceTensor to create a tensor on Houmo device.
For multi-device models, the tensor must be on COMPND device.
The pre-allocated tensor must match all attributes of the corresponding model output tensor, with the same name specified in the name parameter, as retrieved via Module::GetOutputInfo, and must be allocated on the same device type (Houmo device) and device ID as that output tensor.
- Parameters:
name -- [in] The name of the output tensor.
tensor -- [in] The output Tensor object.
- Returns:
Returns the status of the function call.
Warning
This function does not automatically copy tensor data to contiguous host CPU memory. You need to call Tensor::ToHost and Tensor::CastTo to explicitly copy and convert tensor data to host CPU memory if needed.
SetInput
-
Status tcim::Module::SetInput(const std::string &name, const Tensor &tensor)
Sets the input data to the pre-allocated memory on host or Houmo device with the given tensor name. The input data includes input tensors defined with the Tensor class.
Notes
This function can only be used with the following data restrictions:
For "quantized fixed" data precision type (marked with * in the table), you must quantize the model before calling this function.
Data Format | Device Type | Data Precision | Data Alignment Required
 | CPU | fixed | NO
 | CPU | fixed | NO
 | CPU | fixed | NO
 | CPU | quantized fixed * | NO
 | HDPL | fixed | YES
 | HDPL | fixed | YES
 | HDPL | fixed | YES
 | HDPL | quantized fixed * | YES
- Parameters:
name -- [in] The name of the input tensor.
tensor -- [in] The input Tensor object. For non-image data, the input must be quantized.
- Returns:
Returns the status of the function call.
Note
If the original tensor data in host memory is not stored in a contiguous layout, the tensor data will be copied automatically to contiguous host CPU memory.
SetOutput
-
Status tcim::Module::SetOutput(const std::string &name, Tensor &tensor)
Sets the output data to the pre-allocated memory on Houmo device with the given tensor name. The output data consists of output tensors defined with the Tensor class.
Note
The TensorInfo of the tensor set in this function must match the TensorInfo returned by Module::GetOutputInfo for the tensor with the same name specified in the name parameter.
- Parameters:
name -- [in] The name of the output tensor.
tensor -- [in] The output Tensor object.
- Returns:
Returns the status of the function call.
Note
If the original tensor data in host memory is not stored in a contiguous layout, the tensor data will be copied automatically to contiguous host CPU memory.
SetStream
Sync
-
Status tcim::Module::Sync()
Waits until the previous operation is completed.
- Returns:
Returns the status of the function call.
Note
If the auto_yield parameter was set to true when initializing the Stream object via the constructor, the IPU resources used by this stream will be automatically released after this function completes.
LoadFromFile
-
static Module tcim::Module::LoadFromFile(const std::string &filename, const Option& = Option())
Creates a module with the given binary model file (.hmm or .hmms) and configuration options.
- Parameters:
filename -- [in] The file name and path of the binary model file.
Option -- The configuration options for loading the binary model file, defined in Module::Option.
- Returns:
Returns a new Module object.
LoadFromMem
-
static Module tcim::Module::LoadFromMem(const void *data, uint64_t len, const Option& = Option())
Creates a module from host memory with the given binary model file (.hmm or .hmms), size, and configuration options. The file will be copied to Houmo device memory automatically for inference.
This API is used to load a confidential binary model file, ensuring its security and preventing unauthorized access. The model data should be decrypted before being passed to this API.
- Parameters:
data -- [in] Pointer to the memory containing the binary model file.
len -- [in] The size of the memory in bytes.
Option -- The configuration options for loading the binary model file, defined in Module::Option.
- Returns:
Returns a new Module object.
Note
Ensure that the binary model file to be loaded is a precise binary match to the one generated after compilation. Any modification may lead to undefined behavior after calling this API.
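As an illustration of preparing the data and len arguments, the sketch below reads a file into host memory without modifying it. `ReadAllBytes`, `ModelBlob`, and `AsBlob` are hypothetical helpers that are not part of the library, and any decryption step is assumed to happen on the returned bytes before the blob is built.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads a whole file into memory. Returns an empty vector if it cannot be opened.
inline std::vector<char> ReadAllBytes(const std::string &path) {
  std::ifstream in(path, std::ios::binary);
  if (!in) {
    return {};
  }
  return std::vector<char>(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
}

// Pointer and length in the shape of the documented (data, len) parameters.
struct ModelBlob {
  const void *data;
  std::uint64_t len;
};

// The vector must outlive every use of the returned blob.
inline ModelBlob AsBlob(const std::vector<char> &bytes) {
  assert(!bytes.empty() && "model file is missing or empty");
  return ModelBlob{bytes.data(), static_cast<std::uint64_t>(bytes.size())};
}

// With the real library, the call would then look like this (not compiled here):
//   tcim::Module m = tcim::Module::LoadFromMem(blob.data, blob.len);
```

Opening the stream in binary mode matters here, since any byte translation would break the exact binary match that the note above requires.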
LoadModelInfo
-
static std::string tcim::Module::LoadModelInfo(const std::string &filename)
Loads and returns the model information as a JSON string from a binary model file (.hmm or .hmms).
This API parses the specified binary model file and extracts its metadata, returning a JSON-formatted string that describes the model information (e.g., version, inputs/outputs, tensor shapes, and other attributes).
- Parameters:
filename -- [in] The file name and path of the binary model file.
- Returns:
Returns a JSON string containing the model information.
Note
Ensure that the binary model file is a precise binary match to the one generated after compilation. Any modification may lead to undefined behavior or incorrect metadata being reported.
-
static std::string tcim::Module::LoadModelInfo(const void *data, size_t size)
Loads and returns the model information as a JSON string from host memory.
This API parses the binary model content provided in host memory and extracts its metadata, returning a JSON-formatted string that describes the model information (e.g., version, inputs/outputs, tensor shapes, and other attributes).
This API can be used with confidential model data. The model bytes should be decrypted before being passed to this API to ensure correct parsing.
- Parameters:
data -- [in] Pointer to the memory containing the binary model file.
size -- [in] The size of the memory in bytes.
- Returns:
Returns a JSON string containing the model information.
Note
Ensure that the provided memory buffer is a precise binary match to the compiled model output. Any modification may lead to undefined behavior or incorrect metadata being reported.
Class Stream
-
class Stream
Represents a stream that is used by the module for inference.
Notes
If a Stream object is defined as a global variable, it must be explicitly destroyed before the main function exits.
When creating a Stream object, the actual stream is not instantiated immediately. This object is initialized only when tcim::Module::SetStream is called with the Stream object. This deferred instantiation allows for more efficient resource allocation during the model loading process.
Stream
-
explicit tcim::Stream::Stream(bool auto_yield = true)
Constructs a Stream object.
- Parameters:
auto_yield -- [in] Specifies whether to automatically release IPU core resources after all tasks in the stream are completed. Defaults to true.
GetInitStatus
operator=
Sync
-
Status tcim::Stream::Sync()
Waits for all tasks in the stream to complete.
- Returns:
Returns the status of the function call.
Note
If the auto_yield parameter was set to true when initializing the Stream object via the constructor, the IPU resources used by this stream will be automatically released after this function completes.
SyncYield
-
Status tcim::Stream::SyncYield()
Releases the IPU core resources that are used by the stream after performing Stream::Sync.
- Returns:
Returns the status of the function call.
Note
Before releasing the IPU resources, the Stream::Sync function is automatically invoked to ensure all operations in the stream are completed.
Class Tensor
-
class Tensor
Represents an input, output, or intermediate computation result.
Tensor
-
tcim::Tensor::Tensor()
Default constructor. The object constructed from this interface is invalid.
-
tcim::Tensor::Tensor(const Tensor &other) = default
Default copy constructor.
- Parameters:
other -- [in] Other Tensor object.
-
tcim::Tensor::Tensor(const TensorInfo &info, const Buffer &buffer)
Constructs a Tensor object from the given TensorInfo and Buffer.
- Parameters:
info -- [in] The information of the tensor.
buffer -- [in] The buffer containing the tensor data.
AsFormular
-
Tensor tcim::Tensor::AsFormular() const
Creates a new Tensor object with normalized TensorInfo.
The returned Tensor shares the same underlying buffer but has a normalized TensorInfo (via TensorInfo::AsFormular()).
- Returns:
Returns a new Tensor object with normalized info.
AsType
-
Tensor tcim::Tensor::AsType(DataType dtype, bool auto_cast = true) const
Creates a new tensor with the specified target data type and, optionally, casts the data of the original tensor to the target data type and stores the cast data in the new tensor.
The new tensor is created with a contiguous memory layout in host memory. The new tensor does not contain any quantization information (QuantInfo).
Note:
If auto_cast is set to false, the new tensor is created without storing any data. To store data from the original Tensor object, set auto_cast to true, which automatically casts the data to the target data type and stores it in the new tensor.
- Parameters:
dtype -- [in] The target data type for the new tensor, defined in DataType.
auto_cast -- [in] Specifies whether to automatically cast the data of the original Tensor object to the target data type specified in dtype. If set to true, the data is cast to the target data type with Tensor::CastTo and stored in the new tensor. If set to false, the new tensor is created with the target data type without performing data casting. Defaults to true.
- Returns:
Returns a new Tensor object with the specified data type without quantization information.
Note
This function is only valid for non-image (multi-dimensional) tensors.
Buffer
CastTo
-
Status tcim::Tensor::CastTo(Tensor &tensor) const
Casts the data of the current Tensor object to the specified data type, and stores the casted data in the given target tensor.
The supported data type conversions are as follows:
From/To | INT8, INT16, INT32 | FLOAT32 | FLOAT16
INT8, INT16, INT32 | Not Supported | Supported | Supported
FLOAT32 | Supported | NA | Supported
FLOAT16 | Supported | Supported | NA
- Parameters:
tensor -- [inout] The target tensor where the cast data is stored. The data type of this tensor determines the target type of the cast operation.
- Returns:
Returns the status of the function call.
Note
This function only supports contiguous tensors in host memory.
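The conversion table above can be expressed as a small predicate, which may be handy for checking a type pair before calling CastTo. This is a minimal sketch, not part of the library: the `DType` enum, `IsCastSupported`, and `RequireCastSupported` are hypothetical names, and treating the "NA" cells (same floating-point type) as unsupported is an assumption.

```cpp
#include <cassert>

// Hypothetical local enum mirroring the type names in the table; not tcim::DataType.
enum class DType { INT8, INT16, INT32, FLOAT16, FLOAT32 };

constexpr bool IsInteger(DType t) {
  return t == DType::INT8 || t == DType::INT16 || t == DType::INT32;
}

// true  -> "Supported" in the table
// false -> "Not Supported" (integer to integer) or "NA" (same floating-point type)
constexpr bool IsCastSupported(DType from, DType to) {
  return !(IsInteger(from) && IsInteger(to)) && from != to;
}

// Debug-build guard to run before attempting a conversion.
inline void RequireCastSupported(DType from, DType to) {
  assert(IsCastSupported(from, to) && "unsupported data type conversion");
  (void)from;  // avoid unused-parameter warnings when NDEBUG is defined
  (void)to;
}
```

For example, `IsCastSupported(DType::INT16, DType::FLOAT32)` is true, while `IsCastSupported(DType::INT8, DType::INT32)` is false.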
Clone
-
Tensor tcim::Tensor::Clone(bool auto_copy = true)
Creates a copy of the current Tensor object.
- Parameters:
stream -- [in] The stream used for performing the deep copy. This is only used if auto_copy is true.
auto_copy -- [in] Determines if the data stored in the buffer is copied to the new Tensor object.
If set to true (default), a new buffer is allocated, and both the data stored in the buffer and its associated metadata (such as size and memory type) are copied to the new buffer.
If set to false, a new buffer is allocated, and only the metadata (such as size and memory type) is copied to the new buffer, but its memory is uninitialized and does not contain the original data.
- Returns:
Returns a new Tensor object as a copy of the current one.
CopyTo
-
Status tcim::Tensor::CopyTo(Tensor &dst) const
Copies the tensor data of the current Tensor object to another Tensor object.
Notes
Before calling this function, make sure the format, shape, and data type of source and target tensors are identical and valid.
This function automatically handles data alignment when required.
- Parameters:
dst -- [inout] The Tensor object to copy to.
- Returns:
Returns the status of the function call.
Note
Data may be modified during copying from the source to target due to differences in device type, contiguity requirements, and other data characteristics.
Data
-
void *tcim::Tensor::Data() const
Retrieves the pointer to the memory address where the tensor data is stored.
- Returns:
Returns a pointer to the data stored in the tensor.
Device
DeviceId
-
int tcim::Tensor::DeviceId() const
Retrieves the logical ID of the Houmo device on which the tensor is stored.
- Returns:
The logical ID of the Houmo device on which the tensor is stored.
GetInitStatus
Info
-
const TensorInfo &tcim::Tensor::Info() const
Retrieves the information of the tensor associated with the current Tensor object.
- Returns:
Returns the information of the tensor.
MemSize
-
size_t tcim::Tensor::MemSize() const
Retrieves the memory size allocated for the tensor.
If the tensor is stored within a buffer, the buffer size, as returned by Buffer::Size, should be equal to or greater than the value returned by this function.
- Returns:
Returns the minimum memory size required to store the tensor data.
operator!=
operator=
operator==
SelectBatch
-
Tensor tcim::Tensor::SelectBatch(const std::vector<int64_t> &d) const
Selects a batch of elements along the batch dimension from the original Tensor object, and returns a new Tensor object that shares the underlying memory with the original Tensor object.
The returned tensor and the original tensor use the same buffer. However, the actual memory addresses accessed vary depending on the batch selection, as different batch indices map to different memory regions within the same buffer.
- Parameters:
d -- [in] A reference to a vector of indices specifying the batch elements to select.
- Returns:
Returns a new Tensor object that references the selected batch of elements from the original tensor.
Warning
The returned tensor becomes invalid if the original tensor is destroyed. You must ensure that the original tensor remains valid for a longer duration than the returned tensor to avoid undefined behavior.
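To make the "different memory regions within the same buffer" point concrete, the sketch below computes where a single batch element would start. It assumes a contiguous layout with the batch dimension first and no alignment padding, which may not hold for device tensors; `BatchBytes` and `BatchOffset` are hypothetical helpers, not library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes occupied by one batch element, assuming a contiguous layout with the
// batch dimension first (an assumption; real layouts may carry strides).
inline std::size_t BatchBytes(const std::vector<std::int64_t> &shape,
                              std::size_t elem_size) {
  assert(!shape.empty());
  std::size_t bytes = elem_size;
  for (std::size_t i = 1; i < shape.size(); ++i) {
    bytes *= static_cast<std::size_t>(shape[i]);
  }
  return bytes;
}

// Byte offset inside the shared buffer at which batch element `index` starts.
inline std::size_t BatchOffset(const std::vector<std::int64_t> &shape,
                               std::size_t elem_size, std::int64_t index) {
  assert(!shape.empty() && index >= 0 && index < shape[0]);
  return static_cast<std::size_t>(index) * BatchBytes(shape, elem_size);
}
```

Under these assumptions, a FLOAT32 tensor of shape [4, 3, 8, 8] has 768 bytes per batch element, so batch index 2 starts 1536 bytes into the buffer.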
SelectROI
-
Tensor tcim::Tensor::SelectROI(const std::vector<int64_t> &roi_start, const std::vector<int64_t> &shape) const
Selects a Region of Interest (ROI) from the current Tensor and creates a new Tensor.
This method supports N-dimensional (ND) Tensors. The roi_start and shape parameters define the region to be extracted.
- Example
// Assume 'tensor' is a 3D Tensor with shape [10, 20, 30]
std::vector<int64_t> roi_start = {2, 3, 5}; // Start indices for each dimension
std::vector<int64_t> shape = {3, 7, 20}; // Shape of the ROI
// - Dimension 1: start=2, size=3 (elements 2, 3, 4)
// - Dimension 2: start=3, size=7 (elements 3, 4, 5, 6, 7, 8, 9)
// - Dimension 3: start=5, size=20 (elements 5, 6, ..., 24)
Tensor roi_tensor = tensor.SelectROI(roi_start, shape);
// roi_tensor will have shape [3, 7, 20]
- Parameters:
roi_start -- [in] A std::vector<int64_t> representing the starting indices of the ROI. The size of roi_start must match the number of dimensions of the Tensor. Each element in roi_start specifies the starting index for the corresponding dimension: [start_dim1, start_dim2, ..., start_dimN].
shape -- [in] A std::vector<int64_t> representing the shape of the ROI. The size of shape must match the number of dimensions of the Tensor. Each element in shape specifies the size (number of elements) of the ROI along the corresponding dimension: [size_dim1, size_dim2, ..., size_dimN].
- Returns:
A new Tensor containing the data from the selected ROI. Returns an empty Tensor if the ROI is invalid (e.g., out of bounds). May throw an exception in some implementations for invalid ROI.
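Since an invalid ROI yields an empty Tensor or possibly an exception, it can help to validate the arguments up front. The sketch below is one way to do that under stated assumptions: `IsValidRoi` and `RoiElementCount` are hypothetical helpers, and rejecting zero-sized extents is a choice made here rather than documented library behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Dims = std::vector<std::int64_t>;

// True when the ROI has the tensor's rank and lies fully inside the tensor.
inline bool IsValidRoi(const Dims &tensor_shape, const Dims &roi_start,
                       const Dims &roi_shape) {
  if (roi_start.size() != tensor_shape.size() ||
      roi_shape.size() != tensor_shape.size()) {
    return false;
  }
  for (std::size_t i = 0; i < tensor_shape.size(); ++i) {
    if (roi_start[i] < 0 || roi_shape[i] <= 0) return false;
    if (roi_start[i] + roi_shape[i] > tensor_shape[i]) return false;
  }
  return true;
}

// Number of elements covered by a valid ROI.
inline std::int64_t RoiElementCount(const Dims &tensor_shape,
                                    const Dims &roi_start,
                                    const Dims &roi_shape) {
  assert(IsValidRoi(tensor_shape, roi_start, roi_shape));
  (void)tensor_shape;  // only used by the assert above
  (void)roi_start;
  std::int64_t count = 1;
  for (std::int64_t extent : roi_shape) count *= extent;
  return count;
}
```

With the values from the example above (tensor shape [10, 20, 30], start {2, 3, 5}, ROI shape {3, 7, 20}), the check passes and the ROI covers 420 elements.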
SplitYUV
-
Status tcim::Tensor::SplitYUV(Tensor &y, Tensor &uv) const
Splits a YUV-formatted tensor into separate Y and UV tensors without additional memory allocation.
This function extracts the Y (luminance) and UV (chrominance) components from a YUV tensor formatted as YUV420SP, YUV422SP, or YUV444SP. The resulting Y and UV tensors are stored as non-image format tensors and share the same underlying memory buffer as the original YUV tensor, avoiding additional memory allocation.
- Parameters:
y -- [out] Reference to a non-image tensor that stores the Y component.
uv -- [out] Reference to a non-image tensor that stores the interleaved UV components.
- Returns:
Returns the status of the function call.
Note
The split tensors share the same underlying memory as the original YUV tensor, so their lifecycle is tied to the original tensor.
This function can be used to split YUV tensors in both host and device memory.
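The sizes of the two views can be worked out from the image dimensions. The sketch below does so for 8-bit semi-planar data, assuming tightly packed rows and even width and height; the `YuvFmt` enum, `YuvPlanes`, and `SplitSizes` are hypothetical names, and the library's actual layout may add alignment padding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical local enum; the library's own format type is DataFmt.
enum class YuvFmt { YUV420SP, YUV422SP, YUV444SP };

struct YuvPlanes {
  std::size_t y_bytes;    // size of the Y plane
  std::size_t uv_bytes;   // size of the interleaved UV plane
  std::size_t uv_offset;  // where the UV plane starts within the shared buffer
};

// Plane sizes for 8-bit semi-planar YUV with no row padding (an assumption).
inline YuvPlanes SplitSizes(std::size_t width, std::size_t height, YuvFmt fmt) {
  assert(width > 0 && height > 0);
  const std::size_t y = width * height;
  std::size_t uv = 0;
  switch (fmt) {
    case YuvFmt::YUV420SP: uv = y / 2; break;  // chroma halved in both directions
    case YuvFmt::YUV422SP: uv = y; break;      // chroma halved horizontally only
    case YuvFmt::YUV444SP: uv = y * 2; break;  // full-resolution chroma
  }
  return YuvPlanes{y, uv, y};  // the UV plane directly follows the Y plane
}
```

For a 1920x1080 YUV420SP image this gives a 2073600-byte Y plane followed by a 1036800-byte UV plane.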
ToHost
-
Tensor tcim::Tensor::ToHost(bool to_contiguous = false) const
Retrieves or creates a tensor in host memory with tensor data based on the current Tensor object.
The following table summarizes the behavior of this function:
Original Tensor Location : The memory location of the original tensor.
Is Contiguous : Whether the memory layout of the original tensor is contiguous.
to_contiguous Setting : The value of the to_contiguous parameter.
Returned Tensor : Describes the returned tensor after performing this function, including if the returned tensor has contiguous or non-contiguous memory layout on host.
Original Tensor Location | Is Contiguous | to_contiguous Setting | Returned Tensor
On Houmo device memory | Yes | true/false | New tensor, contiguous.
On Houmo device memory | No | true | New tensor, contiguous.
On Houmo device memory | No | false | New tensor, non-contiguous.
On Host memory | Yes | true/false | Original tensor, contiguous.
On Host memory | No | true | New tensor, contiguous.
On Host memory | No | false | Original tensor, non-contiguous.
- Parameters:
to_contiguous -- [in] Specifies whether to return the tensor with a contiguous or non-contiguous memory layout. If set to true, the returned tensor is stored on host with a contiguous memory layout. If set to false, the returned tensor preserves its original memory layout. Defaults to false.
- Returns:
Returns a Tensor object representing the tensor stored in host memory.
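The six rows of the behavior table reduce to two rules, sketched below as a plain predicate. `ToHostResult`, `PredictToHost`, and `CheckAgainstTable` are hypothetical names used for illustration; the code only restates the table and does not call the library.

```cpp
#include <cassert>

struct ToHostResult {
  bool is_new_tensor;  // false: the original tensor is returned
  bool is_contiguous;  // memory layout of the returned tensor
};

// Condenses the six table rows into two rules.
inline ToHostResult PredictToHost(bool on_device, bool src_contiguous,
                                  bool to_contiguous) {
  ToHostResult r;
  // A new tensor is needed to leave the device, or to repack a strided host tensor.
  r.is_new_tensor = on_device || (!src_contiguous && to_contiguous);
  // The result is contiguous if the source already was, or if repacking was requested.
  r.is_contiguous = src_contiguous || to_contiguous;
  return r;
}

// Debug-build spot check of the rules against three rows of the table.
inline void CheckAgainstTable() {
  assert(PredictToHost(true, false, false).is_new_tensor);   // device, strided, false
  assert(!PredictToHost(false, true, true).is_new_tensor);   // host, contiguous
  assert(PredictToHost(false, false, true).is_contiguous);   // host, strided, true
}
```

One practical consequence: a host tensor that is already contiguous is returned as-is, so callers should not assume that ToHost always produces an independent copy.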
CreateDeviceTensor
-
static Tensor tcim::Tensor::CreateDeviceTensor(const TensorInfo &info, size_t mem_size = 0, int device_id = 0, const std::string &backend_name = "")
Creates a tensor and allocates memory on Houmo device.
- Parameters:
info -- [in] The information of the tensor.
mem_size -- [in] The size, in bytes, of the buffer allocated automatically for the tensor. It must be equal to or larger than the value returned by MemSize() of the tensor info.
device_id -- [in] The logical device ID of the Houmo device on which the buffer memory is allocated. The device 0 is used by default. You can retrieve the logical device IDs via SMI tool. See "SMI Tool User Guide" for details.
backend_name -- [in] The backend name of the Houmo device. Only the default value can be used. You can retrieve the backend name via Module::GetBackendName.
- Returns:
Returns a Tensor object created on Houmo device.
CreateHostTensor
-
static Tensor tcim::Tensor::CreateHostTensor(const TensorInfo &info, size_t mem_size = 0, void *ptr = nullptr)
Creates a tensor on host.
- Parameters:
info -- [in] The information of the tensor.
mem_size -- [in] The size, in bytes, of the buffer allocated automatically for the tensor. It must be equal to or larger than the value returned by MemSize() of the tensor info.
ptr -- [in] Pointer to pre-allocated memory for the tensor on host. If nullptr, new memory is allocated on host.
- Returns:
Returns a Tensor object created in the host memory.
Class TensorInfo
-
class TensorInfo
Represents information about a tensor, including its shape, data type, memory layout, and other relevant attributes. This class provides methods to retrieve tensor information.
TensorInfo
-
tcim::TensorInfo::TensorInfo()
Default constructor.
-
tcim::TensorInfo::TensorInfo(const TensorInfo &other) = default
Default copy constructor.
- Parameters:
other -- [in] A TensorInfo object.
-
tcim::TensorInfo::TensorInfo(TensorInfo &&other) = default
Default move constructor.
- Parameters:
other -- [in] A TensorInfo object.
AsContiguous
-
TensorInfo tcim::TensorInfo::AsContiguous() const
Creates a TensorInfo object with updated memory layout information to contiguous for the tensor.
This function generates a new TensorInfo object based on the current TensorInfo object, updating the memory layout information to indicate that the tensor is stored contiguously without strides. This is typically used for tensor stored on host.
You can create tensors on host via Tensor::CreateHostTensor with the tensor information created by this function.
- Example
Tensor host_tensor = Tensor::CreateHostTensor(dev_tensor.Info().AsContiguous());
- Returns:
Returns a new TensorInfo object with updated memory layout information set to contiguous.
AsFormular
-
TensorInfo tcim::TensorInfo::AsFormular() const
Creates a TensorInfo object with normalized shape and strides.
This function generates a new TensorInfo object where:
Rank0 (scalar) shapes are converted to Rank1[1].
Strides are populated if they were empty.
This is useful for loose shape matching where Rank0 and Rank1[1] are considered equivalent.
- Returns:
Returns a new TensorInfo object with normalized shape and strides.
AsType
-
TensorInfo tcim::TensorInfo::AsType(DataType dtype) const
Creates a TensorInfo object with updated data type based on the current TensorInfo object.
- Parameters:
dtype -- [in] The new data type defined in DataType.
- Returns:
Returns a new TensorInfo object with the updated data type.
Clone
-
TensorInfo tcim::TensorInfo::Clone() const
Creates a copy of the current TensorInfo object.
- Returns:
A new TensorInfo object that is a clone of the current object.
DataType
-
tcim::DataType tcim::TensorInfo::DataType() const
Retrieves the data type of the tensor associated with the current TensorInfo object.
- Returns:
Returns the data type of the tensor.
DataTypeSize
-
size_t tcim::TensorInfo::DataTypeSize() const
Retrieves the memory size (in bytes) of a single element in the tensor based on the data type of the tensor.
For example, if the data type of a tensor is FLOAT32, this function returns 4, as a FLOAT32 value typically occupies 4 bytes in memory.
- Returns:
Returns the memory size (in bytes) of a single element in the tensor.
Format
-
tcim::DataFmt tcim::TensorInfo::Format() const
Retrieves the data format of the tensor associated with the current TensorInfo object.
- Returns:
Returns the data format of the tensor.
GetInitStatus
-
Status tcim::TensorInfo::GetInitStatus() const
Retrieves the status of the constructor function and related resources.
- Returns:
Returns the status of the constructor function.
GetQuantInfo
-
std::shared_ptr<QuantInfo> tcim::TensorInfo::GetQuantInfo() const
Retrieves the quantization information of the tensor associated with the current TensorInfo object.
- Returns:
Returns a shared pointer to the quantization information of the tensor.
IsContiguous
-
bool tcim::TensorInfo::IsContiguous() const
Checks if the tensor data associated with the current TensorInfo object has a contiguous memory layout. A contiguous tensor has no strides or its strides are aligned with the shape of the tensor. For example:
strides[n] = shape[n-1] * shape[n-2] * ... * shape[0]
where:
n: The index of the tensor dimension.
strides[n]: The number of elements to move across dimension n.
shape[n]: The size of the n-th dimension of the tensor.
- Returns:
Returns true if the tensor has a contiguous memory layout.
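The contiguity rule can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: strides are counted in elements, the shape is listed outermost dimension first (so the last dimension varies fastest, which is the formula above with dimensions indexed from the innermost one), and `ContiguousStrides` and `IsContiguous` are hypothetical free functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Dims = std::vector<std::int64_t>;

// Element strides of a densely packed row-major tensor: the last dimension
// has stride 1 and each earlier stride is the product of all later extents.
inline Dims ContiguousStrides(const Dims &shape) {
  Dims strides(shape.size(), 1);
  for (std::size_t i = shape.size(); i-- > 1;) {
    assert(shape[i] > 0);
    strides[i - 1] = strides[i] * shape[i];
  }
  return strides;
}

// Mirrors the documented rule: no strides at all, or strides that follow the shape.
inline bool IsContiguous(const Dims &shape, const Dims &strides) {
  return strides.empty() || strides == ContiguousStrides(shape);
}
```

For a shape of [2, 3, 4] the packed strides are [12, 4, 1]; a tensor with strides [16, 4, 1] has padding between outer slices and is therefore not contiguous.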
IsMatch
-
bool tcim::TensorInfo::IsMatch(TensorInfo &other) const
Checks if the associated tensors match for mutual copying.
The associated tensors only match if the following requirements are met:
The name, format, shape, and precision of source and target tensors must be identical and valid.
- Parameters:
other -- [in] The TensorInfo object to compare with.
- Returns:
Returns true if the associated tensors match for mutual copying. Otherwise, returns false.
MemSize
-
size_t tcim::TensorInfo::MemSize() const
Retrieves the memory size (in bytes) allocated for the tensor associated with the current TensorInfo object.
Note
For non-contiguous tensors, the returned memory size may be equal to or larger than the actual size of the tensor due to alignment requirements on Houmo device memory.
Size() equals MemSize() only for contiguous tensors.
- Returns:
Returns the memory size of the tensor in bytes.
operator!=
-
bool tcim::TensorInfo::operator!=(const TensorInfo &other) const noexcept
Checks if the current TensorInfo object is not equal to another TensorInfo object.
- Parameters:
other -- [in] The TensorInfo object to compare with.
- Returns:
Returns true if the two TensorInfo objects are not equal. Otherwise, returns false.
operator=
-
TensorInfo &tcim::TensorInfo::operator=(const TensorInfo &other) = default
Default copy assignment operator. Assigns the tensor information of a TensorInfo object to the current TensorInfo object.
- Parameters:
other -- [in] The TensorInfo object to copy from.
- Returns:
Reference to the current TensorInfo object.
-
TensorInfo &tcim::TensorInfo::operator=(TensorInfo &&other) = default
Default move assignment operator. Moves the tensor information of a TensorInfo object into the current TensorInfo object.
- Parameters:
other -- [in] The TensorInfo object to move from.
- Returns:
Reference to the current TensorInfo object.
operator==
-
bool tcim::TensorInfo::operator==(const TensorInfo &other) const noexcept
Checks if the current TensorInfo object is equal to another TensorInfo object.
- Parameters:
other -- [in] The TensorInfo object to compare with.
- Returns:
Returns true if the two TensorInfo objects are equal. Otherwise, returns false.
Shape
-
const std::vector<int64_t> &tcim::TensorInfo::Shape() const
Retrieves the shape of the tensor associated with the current TensorInfo object.
- Returns:
Returns the shape of the tensor.
SpanSize
-
size_t tcim::TensorInfo::SpanSize() const
Retrieves the padding size in bytes for memory alignment at the end of the memory block used to store the tensor.
This function returns the size of the memory allocated at the end of the memory block to meet the memory alignments. The returned memory size is not part of the actual data size of the tensor, it is only for memory alignments.
- Returns:
Returns the padding size in bytes for alignment at the end of the memory block for storing the tensor.
Stride
-
const std::vector<int64_t> &tcim::TensorInfo::Stride() const
Retrieves the stride of the tensor associated with the current TensorInfo object. The stride represents the number of bytes to be skipped in memory to move from one element to another along each dimension.
- Returns:
Returns a reference to a vector containing the stride for each dimension of the tensor.
CreateNDInfo
-
static TensorInfo tcim::TensorInfo::CreateNDInfo(const std::vector<int64_t> &shape, const DataType type, const std::vector<int64_t> &stride = {}, const size_t span_size = 0)
Creates a TensorInfo object for a non-image (ND) tensor with the specified shape, data type, stride, and span size.
- Parameters:
shape -- [in] A vector specifying the size of each dimension of the tensor.
type -- [in] The data type of the tensor defined in DataType.
stride -- [in] (Optional) A vector specifying the stride for each dimension. The stride defines the memory layout and the number of bytes to move between adjacent elements along each dimension. Defaults to an empty vector, indicating a contiguous memory layout.
span_size -- [in] (Optional) The number of padding bytes used for memory alignment. Defaults to 0.
- Returns:
Returns a TensorInfo object representing information about a multi-dimensional non-image tensor.
CreateYUVInfo
-
static TensorInfo tcim::TensorInfo::CreateYUVInfo(int64_t n, int64_t w, int64_t h, DataFmt yuv_format)
Creates a TensorInfo object for a batch of YUV images with the specified batch size, width, height, and format.
This function is used to create a TensorInfo for a batch of YUV images.
- Parameters:
n -- [in] The number of YUV images in the batch.
w -- [in] The width of each YUV image.
h -- [in] The height of each YUV image.
yuv_format -- [in] The format of the YUV image defined in DataFmt.
- Returns:
Returns a TensorInfo object representing information about a tensor that contains a batch of YUV images.
-
static TensorInfo tcim::TensorInfo::CreateYUVInfo(int64_t w, int64_t h, DataFmt yuv_format)
Creates a TensorInfo object for a YUV image with the specified image width, height, and format.
This function is used to create a TensorInfo for a single YUV image.
- Parameters:
w -- [in] The width of the YUV image.
h -- [in] The height of the YUV image.
yuv_format -- [in] The format of the YUV image defined in DataFmt.
- Returns:
Returns a TensorInfo object representing information about an image tensor.
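To show how n, w, and h relate to the amount of data in a YUV tensor, here is a rough sketch that computes the byte count for a batch of images. It assumes an 8-bit YUV 4:2:0 semi-planar format (such as NV12) with no row padding. The formats that DataFmt actually accepts are not listed in this section, and Yuv420BatchBytes is a hypothetical helper, not a tcim function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the tcim API.
// Byte count for n 8-bit YUV 4:2:0 semi-planar images (assumed format),
// ignoring any alignment padding the device may add.
std::int64_t Yuv420BatchBytes(std::int64_t n, std::int64_t w, std::int64_t h) {
    assert(n > 0 && w > 0 && h > 0);
    assert(w % 2 == 0 && h % 2 == 0);         // 4:2:0 subsampling needs even sizes
    const std::int64_t y_bytes = w * h;       // one luma byte per pixel
    const std::int64_t uv_bytes = w * h / 2;  // interleaved U and V, subsampled 2x2
    return n * (y_bytes + uv_bytes);
}
```

Under these assumptions, a single 1920x1080 image occupies 3110400 bytes, and a batch of four occupies four times that.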
MergeYUV
-
static TensorInfo tcim::TensorInfo::MergeYUV(const TensorInfo &y, const TensorInfo &uv)
Creates a TensorInfo object for a merged YUV tensor from the tensor information of the Y tensor (Y component) and the UV tensor (UV component).
This function combines the information of Y and UV tensors into a single TensorInfo object that represents the information of a complete YUV image. Use this function when you have separate Y and UV tensors, and need to create a TensorInfo object for a single merged YUV tensor.
- Parameters:
y -- [in] The TensorInfo object for the Y tensor, representing the Y component of the YUV image.
uv -- [in] The TensorInfo object for the UV tensor, representing the UV component of the YUV image.
- Returns:
Returns a TensorInfo object representing the information of the merged YUV tensor.
