Stream Stock Market Data from a CSV File Using Node.js

This lab demonstrates how to use a Node.js Kafka producer to stream data from a CSV file into a Redpanda topic. The script simulates real-time stock market activity by sending JSON-formatted messages to the topic, for example:

{"Date":"10/22/2013","Close/Last":"$40.45","Volume":"8347540","Open":"$39.95","High":"$40.54","Low":"$39.80"}

The script can also loop through the data continuously, send the rows in reverse order, and rewrite a date column for time-series analysis.
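To illustrate how a CSV row can become a message like the one above, here is a minimal sketch. It is not the lab's producer.js: the helper name csvToMessages is hypothetical, the header order is inferred from the sample message, and the parser assumes a simple CSV with no quoted commas.

```javascript
// Minimal sketch: turn CSV text into one JSON message per data row.
// Assumes a header row and no quoted fields or embedded commas.
function csvToMessages(csvText) {
  const [headerLine, ...rowLines] = csvText.trim().split(/\r?\n/);
  const headers = headerLine.split(",");
  return rowLines.map((line) => {
    const values = line.split(",");
    const record = {};
    headers.forEach((name, i) => {
      record[name] = values[i]; // values stay strings, as in the sample message
    });
    return JSON.stringify(record);
  });
}

// Header order is an assumption inferred from the sample message.
const sampleCsv = [
  "Date,Close/Last,Volume,Open,High,Low",
  "10/22/2013,$40.45,8347540,$39.95,$40.54,$39.80",
].join("\n");

const messages = csvToMessages(sampleCsv);
console.log(messages[0]);
```

Each element of messages is a string that could be used as the value of one record sent to the topic.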

In this lab, you will:

  • Run the producer that streams data from a CSV file directly into a Redpanda topic.

  • Discover methods to alter the data stream, such as reversing the data sequence or looping through the data continuously for persistent simulations.

  • Adjust date fields dynamically to represent different time frames for analysis.

Prerequisites

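Before running the lab, ensure you have the following installed on your host machine:

  • Git, to clone the repository.

  • Node.js and npm, to install the dependencies and run the producer.

  • Docker and Docker Compose, to run the local Redpanda cluster.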

Run the lab

  1. Clone this repository:

    git clone https://github.com/redpanda-data/redpanda-labs.git
  2. Change into the clients/stock-market-activity/nodejs/ directory:

    cd redpanda-labs/clients/stock-market-activity/nodejs
  3. Install the required dependencies:

    npm i
  4. Set the REDPANDA_VERSION environment variable to the version of Redpanda that you want to run. For all available versions, see the Redpanda releases on GitHub.

    For example:

    export REDPANDA_VERSION=24.1.8
  5. Set the REDPANDA_CONSOLE_VERSION environment variable to the version of Redpanda Console that you want to run. For all available versions, see the Redpanda Console releases on GitHub.

    For example:

    export REDPANDA_CONSOLE_VERSION=2.6.0
  6. Start a local Redpanda cluster:

    docker compose -f ../../../docker-compose/single-broker/docker-compose.yml up -d
  7. Start the producer:

    node producer.js --brokers localhost:19092

    You should see the messages that the producer is sending to Redpanda:

    Produced: {"Date":"10/22/2013","Close/Last":"$40.45","Volume":"8347540","Open":"$39.95","High":"$40.54","Low":"$39.80"}
  8. Press Ctrl+C to stop the script.

  9. Open Redpanda Console at localhost:8080.

The producer sent the stock market data in the CSV file to the market_activity topic in Redpanda.
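The topic name market_activity follows from the default CSV path, ../data/market_activity.csv, because the topic defaults to the filename without its extension (see Options). The sketch below shows one way such defaults could be resolved. The helper names defaultTopic and parseBrokers are hypothetical and are not taken from producer.js.

```javascript
const path = require("path");

// Hypothetical helper: derive a default topic from the CSV filename
// by dropping the directory and the extension.
function defaultTopic(csvPath) {
  return path.basename(csvPath, path.extname(csvPath));
}

// Hypothetical helper: split a comma-separated broker list into
// trimmed host:port strings, ignoring empty entries.
function parseBrokers(brokers) {
  return brokers
    .split(",")
    .map((broker) => broker.trim())
    .filter((broker) => broker.length > 0);
}

console.log(defaultTopic("../data/market_activity.csv")); // market_activity
console.log(parseBrokers("localhost:19092")); // [ 'localhost:19092' ]
```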

Options

The script supports several command-line options to control its behavior:

node producer.js [options]

Each option is listed below, followed by its description.

-h, --help

Display the help message and exit.

-f, --file, --csv

Specify the path to the CSV file to be processed. Defaults to ../data/market_activity.csv.

-t, --topic

Specify the topic to which events will be published. Defaults to the name of the CSV file (without its extension).

-b, --broker, --brokers

Comma-separated list of the host and port for each Redpanda broker. Defaults to localhost:9092.

-d, --date

Specify the column in the CSV file that contains dates. The script converts the values in this column to ISO 8601 format. When looping is enabled (-l), the script also advances each date by one day on every pass through the file, so repeated passes simulate a continuing time series.

-r, --reverse

Read the file into memory and reverse the order of data before sending it to Redpanda. When used with the -l option, data is reversed only once before the looping starts, not during each loop iteration.

-l, --loop

Read the file into memory and send its data to Redpanda repeatedly. When combined with the -d option, the script advances the specified date column by one day on each pass, simulating data that arrives over successive days. When combined with -r, the data is reversed once, and every pass then sends the reversed data set.
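The interaction between -d, -r, and -l can be sketched as follows. This is an illustration under stated assumptions, not the code in producer.js: the names shiftDate, streamRows, and maxPasses are hypothetical, dates are assumed to be in MM/DD/YYYY form as in the sample message, and the first pass is assumed to leave dates unshifted.

```javascript
// Convert an MM/DD/YYYY date to ISO 8601, moved forward by `days`.
// UTC is used so the result does not depend on the local time zone.
function shiftDate(mdy, days) {
  const [month, day, year] = mdy.split("/").map(Number);
  return new Date(Date.UTC(year, month - 1, day + days)).toISOString();
}

// Yield rows, optionally reversed once and optionally repeated.
// `maxPasses` is an illustrative cap so the example terminates;
// it does not correspond to a producer option.
function* streamRows(rows, options = {}) {
  const { reverse = false, loop = false, dateColumn = null, maxPasses = 1 } = options;
  const data = reverse ? [...rows].reverse() : rows; // reversed once, before looping
  const passes = loop ? maxPasses : 1;
  for (let pass = 0; pass < passes; pass++) {
    for (const row of data) {
      // Assumption: pass 0 keeps the original date, pass 1 adds one day, and so on.
      yield dateColumn
        ? { ...row, [dateColumn]: shiftDate(row[dateColumn], pass) }
        : row;
    }
  }
}

// Illustrative rows; only the first date appears in the sample message.
const rows = [{ Date: "10/22/2013" }, { Date: "10/21/2013" }];
const dates = [
  ...streamRows(rows, { reverse: true, loop: true, dateColumn: "Date", maxPasses: 2 }),
].map((row) => row.Date);
console.log(dates);
```

With these inputs, the first pass yields the reversed rows with their original dates in ISO 8601 form, and the second pass yields the same rows one day later.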

Clean up

To shut down and delete the containers along with all your cluster data:

docker compose -f ../../../docker-compose/single-broker/docker-compose.yml down -v