
HOWTO dev streaming_in_rails

steveoro edited this page Jan 16, 2021 · 1 revision

HOWTO: Dev: Streaming in Rails


Overview:

There are 3 possible approaches:

  1. Bi-directional streaming over a WebSocket connection.
  2. Mono-directional streaming using the Server-Sent Events (SSE) protocol.
  3. HTTP/1.1+ response streaming (still mono-directional), for sending large files or a large response body with the proper headers.

The web server must fully support response streaming for any of these approaches to work.

WEBrick is known not to work, because it buffers the entire response body before sending it, regardless of the headers you set.

Case 1: Bi-directional streaming

The WebSocket protocol provides a persistent bi-directional connection between client and server, which can also be used to stream data in either direction.

Action Cable has been part of the Rails stack since version 5.0 and eases the development of WebSocket-based connections.

Later releases work better, so prefer Rails 5.1 or above.

Action Cable connections are long-lived and persistent, and they support a large number of connected clients, each one subscribed to any number of uniquely identifiable "channels" (like individual chat rooms).


Case 2: Mono-directional streaming

The older Server-Sent Events (SSE) protocol, conversely, allows mono-directional streaming only, from server to client.

Although it is useful for pop-up notifications and similar server-pushed updates, it may not be what you're looking for, especially if you just need to send a huge response body that has to be "chunked", paginated, or split and streamed.

To enable Server-Sent Events properly in a Rails controller, include ActionController::Live in the streaming controller and set the Content-Type header to 'text/event-stream'.

Then use response.stream.write to write the output, as long as it can be produced one row at a time, and always close the stream when you are done.

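class StreamingController < ActionController::Base
  # Provides response.stream for writing the body incrementally
  include ActionController::Live

  def download_with_stream_events
    response.headers['Content-Type'] = 'text/event-stream'
    # Each SSE message is a "data:" field terminated by a blank line
    10.times { |i| response.stream.write("data: This is row #{i}.\n\n") }
  ensure
    # Always close the stream, otherwise the connection stays open
    response.stream.close
  end
end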
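
The event-stream wire format itself is simple enough to sketch in plain Ruby. The helper below is hypothetical (`format_sse` is not a Rails method, and its keyword names are assumptions made for illustration); it only shows how a message is laid out: optional `event:` and `id:` fields, one `data:` field per payload line, and a blank line as terminator.

```ruby
# Hypothetical helper (not part of Rails): builds one Server-Sent Events message.
def format_sse(data, event: nil, id: nil)
  lines = []
  lines << "event: #{event}" if event
  lines << "id: #{id}" if id
  # A multi-line payload needs one "data:" field per line.
  data.to_s.each_line(chomp: true) { |line| lines << "data: #{line}" }
  # The trailing blank line marks the end of the message.
  lines.join("\n") + "\n\n"
end

message = format_sse("This is row 1.", event: "row", id: 1)
```

In a controller that includes ActionController::Live, such a string could be passed to response.stream.write. Rails also ships an ActionController::Live::SSE wrapper class that does this kind of formatting for you, which is probably the better choice in a real application.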

Case 3: HTTP 1.1-compatible response streaming

By editing response.headers you can make your web server stream the response, as long as you're comfortable with removing the Content-Length header and editing a few other headers to prevent caching and buffering in the server and in any proxy in front of it.

require 'csv' # provides Array#to_csv

class StreamingController < ActionController::Base
  def download
    # Tell Rack to stream the content
    headers.delete("Content-Length")
    # Don't cache anything from this generated endpoint
    headers["Cache-Control"] = "no-cache"
    # Tell the browser this is a CSV file
    headers["Content-Type"] = "text/csv"
    # Make the file download with a specific filename
    headers["Content-Disposition"] = "attachment; filename=\"test_data.csv\""
    # Don't buffer when going through proxy servers (this header is honored by nginx)
    headers["X-Accel-Buffering"] = "no"

    # Set an Enumerator as the body so that we may process 1 row at a time
    self.response_body = Enumerator.new do |yielder|
      1000.times do |index|
        yielder << [index, "This is row #{index}!"].to_csv
      end
    end

    # Set the status to success
    response.status = 200
  end
end

The most important line here is the one with self.response_body = Enumerator.new.

By setting the response body directly to an enumerator (rather than letting Rails set it implicitly by rendering a template), you let the web server iterate over the enumerator and send the data element by element, so each chunk is generated only when the server asks for it.
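
The on-demand behaviour can be checked outside Rails. The following is a minimal plain-Ruby sketch, not the controller code itself: the `generated` array is an assumption added purely to record which rows have actually been built, and `first(3)` stands in for a consumer that stops reading early.

```ruby
require "csv"

generated = [] # records which rows have actually been built

body = Enumerator.new do |yielder|
  1000.times do |index|
    generated << index
    yielder << [index, "This is row #{index}!"].to_csv
  end
end

# Creating the enumerator runs nothing yet.
rows_before = generated.size

# Pull only the first three chunks, then stop iterating.
first_chunks = body.first(3)
```

After this runs, only three of the thousand rows have been generated, which is why the first bytes can reach the client long before the full data set is ready.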

In an actual use-case though, to generate a CSV file you'll probably still need to go to the DB to get the data, which can also be painfully slow in some cases.

If your CSV generation starts with a large database query, you can still use Rails and enumerators to begin loading from the database sooner, and therefore to start the CSV data stream sooner.

In this case, use the find_each method, which loads records in batches (1000 rows per batch by default), and chain a lazy enumerator onto it so that each row is processed one at a time, only when it is needed.

def lazy_build_csv_collection
  MyModel.find_each.lazy.map do |model|
    model.to_row
  end
end
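
To see why the lazy chain matters, here is a rough simulation in plain Ruby with no database involved. Everything in it is a stand-in: `Record`, `fake_find_each` and `query_log` are hypothetical names that only mimic the batch-loading behaviour of find_each when it is called without a block.

```ruby
require "csv"

# Hypothetical stand-in for an ActiveRecord model.
Record = Struct.new(:id, :name) do
  def to_row
    [id, name].to_csv
  end
end

# Mimics find_each without a block: returns an Enumerator that "loads"
# records in batches but yields them one at a time.
def fake_find_each(total:, query_log:, batch_size: 1000)
  Enumerator.new do |yielder|
    (1..total).each_slice(batch_size) do |ids|
      query_log << ids.first # stands in for one SQL query per batch
      ids.each { |id| yielder << Record.new(id, "row #{id}") }
    end
  end
end

query_log = []
csv_rows = fake_find_each(total: 5000, query_log: query_log).lazy.map(&:to_row)

# Consuming two rows triggers only the first batch "query".
first_two = csv_rows.first(2)
```

In the controller from Case 3, the equivalent step would presumably be `self.response_body = lazy_build_csv_collection`, so that rows are converted and sent while later batches have not been fetched yet.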