Introduction & Setup



Host Software

Driver

There are two drivers:

  1. PCI Express kernel mode driver for Windows/Linux, used for accessing the SI-C667xDSP card as a PCIe endpoint device.
  2. Optional USB Stellaris driver for Windows/Linux from TI, used for accessing the SI-C667xDSP card's configuration µController (MCU) via the µUSB port. Mainly used to change the card's boot mode and thereby enable standalone operation of the SI-C667xDSP card.

SI-PCIe Windows Driver Installation

To install a Windows driver for an SI-C667xDSP board booting as a PCIe endpoint device, the following files are required: sic667x.cat, SIC667xDriver.inf, SIC667xDriver.sys, and WdfCoInstaller01009.dll.

  1. Inside the 'Device Manager', right click the SI device and choose 'Update Driver'.
  2. Click 'Browse my computer for driver software'.
  3. Click 'Let me pick from a list of device drivers on my computer'.
  4. Click 'Have Disk...'
  5. Click 'Browse...'
  6. Go to the path where the driver files reside and choose the 'SIC667xDriver.inf'.
  7. Click 'Next'. The driver should now be correctly installed.

SI-PCIe Linux Driver Installation

TBD

Optional USB Bulk/Normal Mode Windows Driver Installation for Tiva MCU

If the SI-C667xDSP board is booting as a standalone card through either the SPI or the JTAG ports, the host computer will only have access to the MCU's µUSB port and NOT the DSP's disabled PCIe port. Therefore, TI's TivaWare for TM4C Series package must also be installed in order to communicate independently with the onboard MCU's µUSB port when the DSP's PCIe port is disabled.

The TM4C123FH6PM MCU is used to control and monitor the peripheral circuitry, as well as to configure the SI-C667xDSP boot modes. Connect a cable between the host computer's USB port and the MCU's µUSB port, then follow the same steps used for installing the SI-PCIe Windows driver. Use the following link to download the necessary driver files for the TM4C123FH6PM device directly from TI's site:

Click on the Download Options button to obtain the latest TivaWare for TM4C Series package, named SW-TM4C-2.2.0.295.exe. After extraction, go to the installed \ti\TivaWare_C_Series-2.2.0.295\windows_drivers subfolder to find the usb_dev_bulk.inf setup information file. Note that it uses the standardized USB driver but with TI's own wrappers and DLL support files.

Detecting Proper Boot after Power Up.

There are two sets of 3 LEDs, with each set comprising a red, yellow, and green LED:

  1. 3 LEDs controlled by the onboard MCU (Tiva µController). These 3 LEDs are located closest to the µUSB port (farthest away from the bracket). When the DSP has properly initialized and booted into the I2C-PCIe mode, the green LED should light up. Other boot modes produce different LED patterns; further enhancements are forthcoming.
  2. 3 LEDs controlled by the optional FPGA. These 3 LEDs are located just above the FMC expansion connector (closest to the bracket). If there is NO FPGA installed, these LEDs will NOT be functional. The functionality of these LEDs is to be determined by the user.

Application Software

SI provides several Host level applications and demos to interact with the DSP. Please note that every Host application has an associated DSP project and comprises two sections: 1) the Main menu for interacting with the DSP over the PCIe bus, and 2) the Tiva Utility Functions menu for independently interacting with the Tiva MCU over the USB port, in order to configure the boot modes and overall board control settings stored inside its flash memory.

NOTE: For Windows, all host applications must be run in Administrator mode.

HostExampleApp Tutorial - Main Menu


HostExampleApp is a command-line project based on the SI_PCIe library. The SI_PCIe library provides API calls that allow users to communicate with SI-C667xDSP embedded boards over the PCIe bus without worrying about low level driver details. HostExampleApp provides some basic cases for users to familiarize themselves with the SI_PCIe APIs. Below is a breakdown of the Main menu, with a summary of each option and how to use it:

1. Load EABI File
This option enables loading of DSP binary files onto SI-C667xDSP boards. EABI files (.out file, previously referred to as COFF file) are DSP binaries or executables that are the outputs of projects compiled by TI's tool chains. The SI-C667xDSP is a multi-core DSP carrier board, where each core is in an idle state when the board is powered on. DSP binary files are separately loaded into each core. After a DSP binary file is loaded, a DSP side interrupt is triggered for the core to start running.

Steps for Load EABI File
Enter EABI filename:
specify the path of the DSP binary file to be uploaded
Enter Core:
select which core to load the binary
Start Core: (y/n)?
manually start the core to run or not

2. Read Address
3. Write Address
These two options allow direct access to single 32-bit memory address locations within the DSP. A brief accessible memory map will be displayed. Please note that the displayed memory addresses reflect the DSP's address space and not the Host's.

Steps for Read/Write DSP Address
Enter address : 0x
enter a hexadecimal DSP address to access
(for write) Enter data : 0x
enter a hexadecimal 32-bit binary data value to be written
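
The prompts above take hexadecimal digits typed after a pre-printed 0x. The following is a minimal sketch of how such input could be parsed and checked before it is used as a 32-bit DSP address or data value. Both helper names are hypothetical and are not part of SI_PCIe, and the 4-byte alignment check is an assumption for single 32-bit accesses.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of SI_PCIe: parse the hex digits typed
 * after the "0x" prompt into a 32-bit value. Returns 0 on success, or
 * -1 on empty input, trailing characters, or a value wider than 32 bits. */
int parse_hex_u32(const char *text, uint32_t *out)
{
    char *end = NULL;
    unsigned long long v;

    assert(text != NULL && out != NULL);
    errno = 0;
    v = strtoull(text, &end, 16);
    if (end == text || *end != '\0' || errno == ERANGE || v > 0xFFFFFFFFull)
        return -1;
    *out = (uint32_t)v;
    return 0;
}

/* Assumption: a single 32-bit access should use a 4-byte aligned address. */
int is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0;
}
```

A caller would reject the input and prompt again when parse_hex_u32 returns -1.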

4. Read Address Buffer
5. Write Address Buffer
These two options are similar to options '2' & '3', but instead allow access to a contiguous block of memory addresses on the DSP side.

Steps for Read/Write Address Buffer
Enter address : 0x
enter a hexadecimal DSP address to access
Enter count :
enter the number of elements to access
Enter Width (0:Byte, 1:Short: 2:Int):
enter the width for each data element
(for write) Enter data : 0x
enter the hexadecimal binary data to be written
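
The count and width entered above together determine how many bytes a buffer access touches. The sketch below shows that arithmetic. The function names are illustrative only, and the element sizes (Short = 2 bytes, Int = 4 bytes) are assumed from the usual C66x data type sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Width codes as shown in the prompt: 0 = Byte, 1 = Short, 2 = Int. */
enum width_code { WIDTH_BYTE = 0, WIDTH_SHORT = 1, WIDTH_INT = 2 };

/* Bytes per element for a width code, or 0 if the code is not valid. */
size_t element_bytes(int width)
{
    switch (width) {
    case WIDTH_BYTE:  return 1;
    case WIDTH_SHORT: return 2;
    case WIDTH_INT:   return 4;
    default:          return 0;
    }
}

/* Total number of bytes covered by `count` elements of the given width. */
size_t buffer_bytes(size_t count, int width)
{
    return count * element_bytes(width);
}
```

For example, a count of 16 with width 2 (Int) covers 64 bytes starting at the entered address.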

6. DMA Read
7. DMA Write
These options allow access to DSP memory spaces via the TI EDMA hardware module. To begin with, the default parameters can be used to schedule the DMA transfers. For more details about the TI EDMA engine and its use, please refer to the EDMA3 section.

8. Program FPGA
Load the FPGA binary in the form of a bitfile into the SI-C667xDSP's optional onboard FPGA device

9. Access PCIESS Registers
This option allows the user to read the information assigned by the Host OS and placed inside the PCIESS Registers.

A. Read Config Space
B. Write Config Space
These two options allow access to the SI-C667xDSP's configuration space as mapped by the Host OS. Please note that it is not recommended to modify any value in the config space, as doing so may create havoc for the Host OS.

C. Messaging
This option allows interrupt based messaging between the Host and the DSP while the DSP is actively running code, and is described in more detail below.


D. Read Static Address
E. Write Static Address
These two options allow access to the SI-C667xDSP's memory space using static addresses.

N. Load SPI NOR
This option loads the NOR flash with the configuration data required by the SPI extended boot mode. It MUST therefore be completed before selecting the SPI extended boot mode in the Set Boot Mode menu option under the Tiva Utility Functions section, described in more detail below. The following accompanying files must be placed inside the same folder as the HostExampleApp: SPI_writer.dat, SPI_writer.out, and usb_i2C.bin.


R. DSP Local Reset
This option allows users to reset the DSP cores along with the SI-C667xDSP hardware peripherals. Please note that there are options to either reset single DSP cores or all DSP cores and peripherals.

Steps for DSP Local Reset
Reset a single core (S) or Reset all cores and peripherals (A):
Reset a single core or all cores and peripherals simultaneously
(for reset single)Enter a core id to do DSP local reset:
Choose which core to reset

S. DSP Start
This option allows users to restart the DSP if it has been halted from running its application by the DSP Local Reset function described above. This step is unnecessary if the DSP's application binary will be reloaded using the Load EABI File option described above.


M. Basic Calculator
This option allows users to interact with a simple calculator DSP program. To use this option, the sample.out binary should be loaded into core0 first.

Steps for Basic Calculator
Select operation (1 - add / 2 - subtract / 3 - multiply / 4- divide) :
select which math operation to pass into the DSP program
Enter input 1:
enter 32-bit input1
Enter input 2:
enter 32-bit input2

HostExampleApp Tutorial - Tiva Utility Functions Menu

The dedicated Tiva Utility Functions menu allows independent access to the onboard MCU's internal flash via its µUSB port, irrespective of the SI-C667xDSP board's PCIe bus or boot mode. Because the MCU controls and monitors the peripheral circuitry and configures the SI-C667xDSP boot modes, using this menu is NOT recommended. To perform these operations, connect a USB to µUSB cable between the host computer's USB port and the SI-C667xDSP's µUSB port.

T. Tiva Bulk/Normal Mode Operations (I2C Flash Access and Boot Modes)
This option opens up the available functions for accessing the MCU's internal flash.

5. Read Flash
6. Erase Flash
7. Write Flash
These three options allow access to parts of the MCU's internal flash using DWord values entered one byte at a time. A brief accessible memory map will be displayed for reference.

A. Set Boot Mode
The following four options set the SI-C667xDSP's boot mode for the next power cycle.

  1. PCIe Endpoint (I2C Extended Boot Mode) - default boot mode. The SI-C667xDSP board is configured to boot in I2C master mode, and is then immediately redirected to boot with a PCIe port configured as an endpoint device.
  2. SPI (I2C Extended Boot Mode). The SI-C667xDSP board is configured to boot in I2C master mode, and is then immediately redirected to boot from a pre-configured SPI NOR flash. Please note that the SI-C667xDSP's PCIe port is DISabled in this mode. Since the NOR flash can only be accessed via the PCIe bus, this mode can only be selected AFTER the Load SPI NOR option (described above) from within the Main menu has completed.
  3. PCIe Endpoint. Hardwired and only intended for debugging and therefore NOT recommended.
  4. No Boot. Only JTAG, mainly for debugging. Please note that the SI-C667xDSP's PCIe port is DISabled.
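
The consequences of each boot mode listed above can be summarized in a small lookup. This is a sketch for illustration only. The enum names and helper functions are made up here, while the numbering follows the menu.

```c
#include <assert.h>
#include <stdbool.h>

/* Menu numbering from the Set Boot Mode list; the names are illustrative. */
enum boot_mode {
    BOOT_PCIE_I2C_EXT   = 1,  /* PCIe endpoint via I2C extended boot (default) */
    BOOT_SPI_I2C_EXT    = 2,  /* SPI NOR flash via I2C extended boot */
    BOOT_PCIE_HARDWIRED = 3,  /* hardwired PCIe endpoint, debugging only */
    BOOT_NONE_JTAG      = 4   /* no boot, JTAG only */
};

/* True if the DSP's PCIe port is available after booting in this mode. */
bool pcie_enabled(enum boot_mode mode)
{
    return mode == BOOT_PCIE_I2C_EXT || mode == BOOT_PCIE_HARDWIRED;
}

/* True if Load SPI NOR must already have completed before choosing the mode. */
bool needs_spi_nor_loaded(enum boot_mode mode)
{
    return mode == BOOT_SPI_I2C_EXT;
}
```

A host tool could use such a check to warn the user before switching to a mode in which the Main menu (PCIe) options will no longer be reachable.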

B. Generate Log
Returns logged data for debugging purposes only.


Qt Demos

SI provides several Host side demo projects developed within the Qt framework. They are designed to work in conjunction with their corresponding DSP side demo projects.

SampleQt
SampleQt is the GUI version of the HostExampleApp.

EDMAQt
Demos/EDMAQt is a GUI project intended to familiarize users with the TI EDMA engine. Please refer to the EDMA3 section for more information.

IPCQt
Demos/IPCQt is a GUI project intended to familiarize users with the TI IPC package's MessageQ submodule. It loads the TI IPC Demo DSP side project. Please refer to the IPC section for more information about the IPC module.


SI_PCIe

SI_PCIe is an API library for users to communicate with the SI-C667xDSP board via the SI C667x driver. HostExampleApp and SampleQt projects demonstrate the basic use cases of SI_PCIe. API function details may be found inside the HostExampleApp project. The following section focuses on some fundamental concepts behind the API library.

PCIe Address Translation

The TI Keystone platform uses its PCIe core module to translate memory addresses across the PCIe bus. A BAR (base address register) is configured to translate memory addresses on the Host to the DSP, or memory addresses on the DSP to the Host. There are 6 BARs assigned to the SI-C667xDSP board, with BAR 0 being reserved as required by the PCI spec.

When using the Read/Write functions of SI_PCIe, users pass in a BAR number and a boolean value that tells the function whether or not to configure the passed in BAR automatically. If 'False' is passed in, the function does not configure the BAR, and it is up to the user to configure it properly before calling the Read/Write functions. cfgBAR_32b is the function that configures a specific BAR. Configuring a BAR introduces a small overhead, so for improved performance or for repeated accesses to the same memory region, manual configuration is recommended.

For more details of the TI PCIe module, please refer to the PCIe section.

Read/Write Accesses Using Target and DMA

SI_PCIe provides two different methods to access DSP memory address locations: Read/Write accesses using target or DMA.

1. Target Read/Write
Target Read/Write accesses require setting up a Host PCIe address translation window within a BAR. The SI-C667x driver will copy the passed in user buffer to the BAR.

2. DMA Read/Write
DMA read/write accesses require a configured BAR to schedule transactions by the TI EDMA engine, while the data transfers themselves do not need a BAR. For large contiguous memory accesses, DMA provides better performance.

For repeated small or fragmented memory accesses, target read/write accesses may render better performance due to the overhead of initializing the EDMA engine. For more info about TI EDMA engine, please refer to the EDMA3 section.
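
The guidance above can be expressed as a simple selection rule. This is only a sketch. The function is hypothetical, and the crossover size is passed in as a parameter because this document does not state one; it would have to be measured on the target system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum access_method { ACCESS_TARGET, ACCESS_DMA };

/* Pick DMA for large contiguous transfers, target access otherwise.
 * dma_min_bytes is the measured size at which the EDMA setup overhead
 * pays off; no particular value is implied here. */
enum access_method choose_access(size_t bytes, bool contiguous,
                                 size_t dma_min_bytes)
{
    if (contiguous && bytes >= dma_min_bytes)
        return ACCESS_DMA;
    return ACCESS_TARGET;
}
```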

General Purpose Messaging/Interrupt From DSP to Host

A general purpose message/interrupt can be sent/generated from the DSP to the Host using the TI PCIe module. The interrupts will be mapped within the Host OS and redirected to the SI C667x driver.

The function WaitInterruptFromDSP(ULONG *msg, int waitMilliSecs, unsigned int src) is a blocking call that makes the Host wait for DSP generated interrupts. It blocks Host program execution until the wait time expires or a new DSP generated interrupt to the Host occurs. There are two different sources of general purpose interrupts, 0 and 1. The message associated with the interrupt is returned through the passed in msg argument.
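
A Host side loop built around this call might look like the sketch below. The parameter list matches the one given above, but the return type and its meaning (0 when an interrupt arrived, nonzero on timeout) are assumptions, since they are not stated here. The wait function is passed in as a pointer so that the loop can be tried with the fake_wait stand-in, which is not part of SI_PCIe.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long ULONG;  /* stand-in for the Windows ULONG type */

/* Same parameters as WaitInterruptFromDSP. The int return value
 * (0 = interrupt received, nonzero = timeout) is an assumption. */
typedef int (*wait_irq_fn)(ULONG *msg, int waitMilliSecs, unsigned int src);

/* Collect up to max_msgs messages from interrupt source src, stopping at
 * the first timeout. Returns the number of messages stored in msgs. */
size_t collect_messages(wait_irq_fn wait, unsigned int src, int timeout_ms,
                        ULONG *msgs, size_t max_msgs)
{
    size_t n = 0;

    assert(wait != NULL && msgs != NULL);
    while (n < max_msgs) {
        ULONG msg = 0;
        if (wait(&msg, timeout_ms, src) != 0)
            break;  /* timed out: no more messages for now */
        msgs[n++] = msg;
    }
    return n;
}

/* Fake wait function for trying the loop without hardware:
 * delivers 0xBEEF twice, then reports a timeout. */
static int fake_wait(ULONG *msg, int waitMilliSecs, unsigned int src)
{
    static int calls = 0;

    (void)waitMilliSecs;
    (void)src;
    if (calls++ >= 2)
        return 1;
    *msg = 0xBEEFul;
    return 0;
}
```

In a real application the library function would be passed in place of fake_wait, after adapting the return value check to the actual API.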

To generate the interrupts on the DSP side, write a binary '1' to the EP_IRQ_SET register within the TI PCIe module. There are four GPRs (General Purpose registers) on the TI PCIe. GPR0 and GPR1 are reserved for SI DMA operations, while GPR2 and GPR3 are used to pass a message to the Host side. GPR2 is associated with general purpose interrupt 0 and GPR3 is associated with general purpose interrupt 1. Data should be written into the GPR register before setting the EP_IRQ_SET register.

For the SI-C667xDSP board, the base address for the TI PCIe registers is 0x2180_0000, the offset for EP_IRQ_SET is 0x64, and the offset for GPR2 is 0x78. The sample code below triggers the interrupt and sends the msg 0xBEEF to the Host.
(*(volatile uint32_t*)(0x21800078)) = 0xbeef; /* GPR2: message for the Host */
(*(volatile uint32_t*)(0x21800064)) = 1;      /* EP_IRQ_SET: raise the interrupt */
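
The same two register writes can be wrapped in a small helper with named constants. This is a sketch; the macro and function names are made up here, and only the base address and the two offsets come from the text above. The register block is passed in as a pointer so that the helper can also be exercised against a plain array off-target.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the text for the SI-C667xDSP board. */
#define PCIE_REGS_BASE   0x21800000u
#define PCIE_EP_IRQ_SET  0x64u  /* byte offset */
#define PCIE_GPR2        0x78u  /* byte offset, message for interrupt 0 */

/* Send msg to the Host on general purpose interrupt 0.
 * regs points at the start of the PCIe register block. */
void pcie_notify_host(volatile uint32_t *regs, uint32_t msg)
{
    assert(regs != 0);
    regs[PCIE_GPR2 / 4u] = msg;       /* write the message first */
    regs[PCIE_EP_IRQ_SET / 4u] = 1u;  /* then raise the interrupt */
}
```

On the DSP the call would be pcie_notify_host((volatile uint32_t *)PCIE_REGS_BASE, 0xBEEF).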

For other SI-C667xDSP boards, please refer to their specific TI documents for the corresponding address.
For more details on the TI PCIe module and offsets of all its registers, please refer to the PCIe section.


DSP

The SI-C667x product line is based on Texas Instruments Keystone Hardware and Software modules. TI's documents and resources are highly recommended for users to fully understand the architecture and utilization of the SI-C667xDSP board. Below is a breakdown of C667x key features and links to learn how to use them.


Software

CCS (Code Composer Studio)

Below is a link to TI's CCS:

CCS is TI's graphical IDE based on Eclipse which contains the necessary tool chains to develop and compile embedded DSP software.
TI CCS tutorial:

SYS/BIOS

SYS/BIOS is a TI real-time operating system running on TI DSPs, and is included with CCS. In previous versions it was called DSP/BIOS, and it is sometimes referred to as the TI RTOS. As a split kernel, each core loads and runs an individual binary created from its own SYS/BIOS project.

For more details on how to develop binaries from SYS/BIOS projects, please refer to SYS/BIOS User's Guide:

Threads (HWI, SWI, Task and Idle)

These are different thread types on SYS/BIOS, with their main differences outlined below:

  • HWI has the highest priority and is associated with hardware interrupts; can preempt SWI, Task and idle, and cannot be preempted by other threads.
  • SWI has second highest priority and is associated with software interrupts; can preempt Task and idle, and won't be preempted by Task and idle.
  • Task has second lowest priority and is not associated with interrupts; will be scheduled by SYS/BIOS when there is no active HWI or SWI running. Task threads also have peer priorities.
  • Idle has the lowest priority, would only run when there is no active HWI, SWI, or Task.
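
The ordering described in the list above can be written down as a tiny model. This sketch is not SYS/BIOS code. It considers the thread type only and ignores priorities within a type, such as the peer priorities of Tasks.

```c
#include <assert.h>
#include <stdbool.h>

/* Thread types ordered from lowest to highest priority. */
enum thread_type {
    THREAD_IDLE = 0,
    THREAD_TASK = 1,
    THREAD_SWI  = 2,
    THREAD_HWI  = 3
};

/* True if a newly ready thread of type `ready` preempts the thread of
 * type `running`, judged by thread type alone. */
bool preempts(enum thread_type ready, enum thread_type running)
{
    return ready > running;
}
```

For example, preempts(THREAD_SWI, THREAD_TASK) is true, while preempts(THREAD_TASK, THREAD_SWI) is false.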

For more programming details, please refer to the SYS/BIOS User's Guide:

IPC (Inter-Processor Communication)

TI provides a software IPC module with many useful submodules that allow communication and data exchange between different cores or threads.

For more details on the TI IPC, please refer to IPC User's Guide:

SI provides a TI IPC Demo project which allows the user to interact with the IPC MessageQ module through a Host side GUI.
The TI IPC module relies heavily on shared memory within the C667x. Because all the cores share a single pipeline to the shared memory, the TI IPC module may not be suitable for exchanging information between cores in highly demanding applications. As such, SI provides a simplified lightweight IPC module with lower latency than the TI IPC module.

For more information on the SI IPC, please refer to:

For using the SI IPC, please refer to:


TI NDK

The TI-RTOS Networking or TI NDK (Network Development Kit) combines a dual mode IPv4/IPv6 stack with some network applications, and is not included with CCS. Therefore, the TI NDK must be separately downloaded; for more details and documentation, please refer to the TI web page:

Please note that the NDK interfaces with NETCP directly and does not utilize the PA; which is to say that the hardware stack will filter some of the incoming packets while the NDK software stack will filter out the rest of the packets. The PA has a steeper learning curve, but if used and properly configured, it will increase performance since unnecessary packets will be filtered out at the hardware level. The NDK, however, provides a simpler interface that allows the user to easily port Host side socket code to the DSP side.

According to TI:
TI provides both NDK and PA LLD. The application can either use NDK or PA LLD with a network stack provided by the customer. If you choose to use the NDK, your application should interface with the NDK only, i.e. invoking NDK APIs for all data traffic. If you choose to use the PA LLD, you need to write your own network stack to interface with the PA LLD and other low layer software stacks such as CPPI and QMSS LLDs.

TI & SI PDK

TI PDK (Platform Development Kit) is a package that provides the foundational drivers and software to enable the device, and is included with CCS. From a software development perspective, the SI-C667xDSP board has differences with the TI EVM and therefore SI provides a customized TI PDK package for its users. To use the SI PDK and create projects with CCS, the following steps must be performed:

  1. Copy everything inside the DSPApplications/pdk folder into the $ti_installation\pdk_C6678_1_1_2_6\packages\ folder (the path is usually C:\ti\pdk_C6678_1_1_2_6\packages\).
  2. Open CCS and click 'File Tab->Import'.
  3. Choose 'Existing CCS Eclipse Projects', then 'Next'.
  4. Select 'Search-Directory', and browse to the path where the project is copied into.
  5. Select 'platform_lib_evmc6678l'; do NOT select 'Copy Project Into Workspace'.
  6. Select 'Finish'.
  7. Build the project under the 'Release Configuration'.
  8. From here on out and in order to build a SYS/BIOS project for the SI-C667xDSP board, right click the project name inside CCS and select 'Properties'.
  9. Go to the 'RTSC' tab.
  10. In the 'Platform' drop-down menu, select 'ti.platforms.evm6678'.
  11. Once the platform is selected, initialization should be done on the DSP side via the EVM_init function before running any DSP programs. Please refer to the EVM_init function inside the Sample_Ethernet project for the platform initialization code. Please note that platform initialization should only be run once in applications where programs may be running on multiple cores. The initialization code should therefore be called first on one core, and the other cores should wait for the initialization to finish before proceeding to BIOS_start().
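
The run-once requirement in the last step could be handled along the lines of the sketch below. The helper, the flag type, and the fake_evm_init stand-in are hypothetical, and the choice of core 0 as the initializing core is an assumption. On the DSP the flag would have to live in memory shared by all cores and be kept cache coherent, which is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical completion flag in memory visible to all cores. */
typedef volatile uint32_t init_flag_t;

/* Core 0 runs init_fn once and publishes completion; every other core
 * waits for the flag. Returns 1 if this call ran the initialization. */
int platform_init_once(uint32_t core_id, init_flag_t *done,
                       void (*init_fn)(void))
{
    assert(done != 0 && init_fn != 0);
    if (core_id == 0) {
        init_fn();
        *done = 1u;
        return 1;
    }
    while (*done == 0u) {
        /* spin until core 0 has finished initializing */
    }
    return 0;
}

/* Stand-in for the board's EVM_init, used only to try the sketch. */
static int init_calls = 0;
static void fake_evm_init(void)
{
    init_calls++;
}
```

Each core would call platform_init_once with its own core number before proceeding to BIOS_start().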

TI Software Libraries

TI provides some really useful C6x software libraries for use on the C667x. The Mathlib in particular is a well optimized and useful library. Please refer to the TI web page:

TI MathLib

The TI DSP Math Library for Floating Point Devices is an optimized floating-point math function library, and is not included with CCS. Therefore, the TI MathLib must be separately downloaded; for more details and documentation, please refer to the TI web page: http://www.ti.com/tool/mathlib

According to TI:
These routines are typically used in computationally intensive real-time applications where optimal execution speed is critical. By using these routines instead of the routines found in the existing run-time-support libraries, you can achieve execution speeds considerably faster without rewriting existing code. The MATHLIB library includes all the floating-point math routines that are currently provided in the existing run-time-support libraries. These new functions can be called with the current run-time-support library names or the new names included in the math library.

Mathlib Test Reports
TI provides Test Reports that are very useful for obtaining the number of cycles required to execute an overall algorithm, and for determining points at which the algorithm can be broken up across multiple DSP cores. The reports are profiled for the RTS, Asm, C, Inline, and Vector versions, which are differently coded versions of the same library. Each one is already compiled and has a distinct library file that one can link to. Besides performance, the main difference between them is interruptibility.

The MATHLIB test report:


SI DSP Demos

SI provides several DSP demo projects, which are designed to work in conjunction with their corresponding Host side demo projects.

Sample_DSP_C667x
This project contains a simple heartbeat loop and some basic arithmetic commands. To use this project, open HostExampleApp and use option '1' to load the precompiled 'sample.out' binary file to the DSP, then use option 'm' for the interactive display.

TI_IPC_Demo
This project is the DSP side code to demonstrate the use of the TI IPC MessageQ module. It is loaded by the Host side Demos/IPCQt project.

Sample_Ethernet
This project contains a simple TCP echo server running on the DSP side. It demonstrates initialization and utilization of the TI Keystone Ethernet interface. Please note that the SI PDK is required along with any Ethernet projects that run on the SI-C667xDSP board; refer to the PDK section for additional details. The workflow for this project is listed below:

  1. EVM_init is called. It initializes the hardware platform and writes correct configuration values into key hardware registers, such as the PLL and Ethernet. Refer to the pdk section for more details about this phase.
  2. main is called. It only serves to call the BIOS_start function.
  3. TaskMain is called. After BIOS_start is called, the SYS/BIOS will start its thread scheduler. TaskMain is called as it's the only registered thread in this project. The first half of TaskMain provides a sample configuration of the Ethernet interface hardware modules, and includes the QMSS, CPPI and PA. For more details on these aforementioned modules, please refer to the Ethernet section. The second half of TaskMain provides a sample configuration of TI NDK stack, and includes setting up the DHCP and the recv/send buffer limit for TCP/UDP. At the end of the function, NC_NetStart will be called, and the NDK will start processing all the configurations.
  4. After the configuration is processed and an actual IP address is bound to the program, NetworkIPAddr will be called. Inside this function, TaskCreate is called to start running the TCP echo server thread. Please note that TaskCreate is a wrapper function from the NDK and is equivalent to creating a task with SYS/BIOS.
  5. ServerThread is called. The TCP echo server code will be running.
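
The core of a TCP echo server, as run by the last step above, is a receive-and-send-back loop. The sketch below shows one iteration of that loop using POSIX sockets so that it can be tried on a Host. It is not taken from the Sample_Ethernet project, the function name is made up, and the NDK's own socket setup and session handling are not shown.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Read one chunk from a connected stream socket and send it straight back.
 * Returns the number of bytes echoed, 0 if the peer closed the connection,
 * or -1 on error. */
long echo_once(int fd)
{
    char buf[256];
    long n = (long)recv(fd, buf, sizeof buf, 0);
    long sent = 0;

    if (n <= 0)
        return n;
    while (sent < n) {  /* send may accept fewer bytes than requested */
        long w = (long)send(fd, buf + sent, (size_t)(n - sent), 0);
        if (w < 0)
            return -1;
        sent += w;
    }
    return n;
}
```

A server thread would call echo_once repeatedly on each accepted connection until it returns 0 or -1.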