Mist is a service for exposing analytics jobs and machine learning models as web services.
Mist provides an API for Scala & Python Apache Spark jobs and for machine learning models.
It implements Spark as a Service and creates a unified API layer for building enterprise solutions and services on top of a Big Data lake.
Discover more Hydrosphere Mist use cases.
######Features
- Real-time, low-latency model serving/scoring
- Exposing Apache Spark jobs through REST API
- Spark 2.0.0 support!
- Spark Contexts orchestration
- Super parallel mode: multiple Spark contexts in separate JVMs or Docker containers
- HTTP & Messaging (MQTT) API
- Scala and Python Spark jobs support
- Support for Spark SQL and Hive
- High Availability and Fault Tolerance
- Self Healing after driver program failure
- Powerful logging
- Clear end-user API
######Dependencies
- jdk = 8
- spark >= 1.5.2 (earlier versions were not tested)
- MQTT Server (optional)
######Run mist
docker run -p 2003:2003 -v /var/run/docker.sock:/var/run/docker.sock -d hydrosphere/mist:master-2.0.0 mist
######Run example
sbt "project examples" package
curl --header "Content-Type: application/json" -X POST http://localhost:2003/api/simple-context --data '{"digits": [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}'
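The same request can be issued from a script. The following is a minimal sketch using only the Python standard library; it assumes Mist is listening on `localhost:2003` as in the `docker run` command above and that the reply is a JSON body. The helper names `build_job_request` and `run_job` are illustrative and are not part of Mist.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust host/port to your deployment.
MIST_URL = "http://localhost:2003/api/simple-context"


def build_job_request(digits, url=MIST_URL):
    """Build (but do not send) the POST request used by the simple-context example."""
    body = json.dumps({"digits": list(digits)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_job(digits, url=MIST_URL, timeout=30):
    """Send the request and return the parsed response.

    Assumption: the response is JSON. Its exact shape is not described
    here, so it is returned unmodified.
    """
    request = build_job_request(digits, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With Mist running, `run_job([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])` sends the same payload as the curl command. Keeping request construction separate from sending makes the payload easy to inspect without a live server.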
Check out the complete Getting Started Guide.
- Build the project
git clone https://github.com/hydrospheredata/mist.git
cd mist
sbt -DsparkVersion=2.0.0 assembly
- Run
./bin/mist start master
# clone mist repo
git clone https://github.com/Hydrospheredata/mist
# available spark versions: 1.5.2, 1.6.2, 2.0.0
export SPARK_VERSION=2.0.0
docker create --name mist-${SPARK_VERSION} -v /usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION}
docker run --name mosquitto-${SPARK_VERSION} -d ansi/mosquitto
docker run --name hdfs-${SPARK_VERSION} --volumes-from mist-${SPARK_VERSION} -d hydrosphere/hdfs start
# run tests
docker run -v /var/run/docker.sock:/var/run/docker.sock --link mosquitto-${SPARK_VERSION}:mosquitto --link hdfs-${SPARK_VERSION}:hdfs -v $PWD:/usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION} tests
# or run mist
docker run -v /var/run/docker.sock:/var/run/docker.sock --link mosquitto-${SPARK_VERSION}:mosquitto --link hdfs-${SPARK_VERSION}:hdfs -v $PWD:/usr/share/mist hydrosphere/mist:tests-${SPARK_VERSION} mist
- Complete Getting Started Guide
- Learn from Use Cases and Tutorials
- Learn about Mist Routers
- Configure mist to make it fast and reliable
| Mist Version | Scala Version | Python Version | Spark Version |
|---|---|---|---|
| 0.1.4 | 2.10.6 | 2.7.6 | >=1.5.2 |
| 0.2.0 | 2.10.6 | 2.7.6 | >=1.5.2 |
| 0.3.0 | 2.10.6 | 2.7.6 | >=1.5.2 |
| 0.4.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
| 0.5.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
| 0.6.5 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
| 0.7.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
| 0.8.0 | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
| master | 2.10.6, 2.11.8 | 2.7.6 | >=1.5.2 |
- Persist job state for self healing
- Super parallel mode: run Spark contexts in separate JVMs
- Powerful logging
- RESTification
- Support streaming contexts/jobs
- Reactive API
- Real-time ML model serving/scoring
- CLI
- Web Interface
- Apache Kafka support
- Bi-directional streaming API
- AMQP support
- Getting Started
- Use Cases & Tutorials
- CLI
- Scala & Python Mist DSL
- REST API
- Streaming API
- Code Examples
- Configuration
- License
- Logging
- Low level API Reference
- Namespaces
- Changelog
- Tests
Please report bugs/problems to: https://github.com/Hydrospheredata/mist/issues.
