anora10/flink-rss-connector

Flink RSS Connector

Flink RSS Connector is a Flink connector which facilitates the processing of RSS feeds as data streams.

Usage

Example query

CREATE TEMPORARY TABLE news (
    `title` STRING,
    `description` STRING
) WITH (
    'connector' = 'rss-connector',
    'uri' = 'https://rss.nytimes.com/services/xml/rss/nyt/Europe.xml,'
            'https://rss.nytimes.com/services/xml/rss/nyt/US.xml,'
            'https://rss.nytimes.com/services/xml/rss/nyt/Africa.xml',
    'refresh-interval' = '5000',
    'format' = 'xml'
);

SELECT * FROM news;

Usage remarks

The rss-connector connector must always be used with the xml format.

Duplicate rows are filtered with a Bloom filter, i.e. the table rows are unique with a high probability.
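
To illustrate the idea behind Bloom-filter deduplication, here is a minimal, hand-rolled sketch. It is not the connector's actual implementation: the class name `BloomDeduplicator`, the bit-array size, the number of hash functions, and the FNV-style hashing are all assumptions chosen for illustration. A Bloom filter never reports a previously seen key as new, but it can occasionally report a new key as already seen, so a small fraction of genuinely new items may be dropped.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical helper (not part of the connector): remembers keys in a Bloom filter. */
final class BloomDeduplicator {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomDeduplicator(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /**
     * Records the key and returns true if at least one of its bits was not set before,
     * i.e. the key is definitely new. Returns false if the key was probably seen already.
     */
    boolean firstSeen(String key) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        int h1 = fnvStyleHash(data, 0x811C9DC5);
        int h2 = fnvStyleHash(data, 0x01000193) | 1; // force an odd step size
        boolean isNew = false;
        // Double hashing: derive numHashes bit positions from two base hashes.
        for (int i = 0; i < numHashes; i++) {
            int index = Math.floorMod(h1 + i * h2, numBits);
            if (!bits.get(index)) {
                isNew = true;
                bits.set(index);
            }
        }
        return isNew;
    }

    /** FNV-1a style byte hash with a caller-chosen starting value. */
    private static int fnvStyleHash(byte[] data, int seed) {
        int hash = seed;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }
}
```

In a source like this one, the key passed to `firstSeen` could be, for example, the concatenated field values of a row; a row would be emitted only when `firstSeen` returns true. Which key the connector actually uses is not stated here.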

The refresh-interval attribute is optional and specifies the polling interval in milliseconds. The default is 10 minutes (600000 ms).

Each field should be named after the RSS XML tag it reads and declared with a string type. Fields whose tag is not present are filled with empty strings.

Multiple URIs may be added to the query, separated by commas.
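
The option handling described in the remarks above (a comma-separated `uri` value and an optional `refresh-interval` in milliseconds with a 10-minute default) could be sketched as follows. This is an assumption-laden illustration rather than the connector's code: the class `RssOptionsSketch` and its method names are hypothetical, and trimming whitespace around each URI is a choice made here, not documented behavior.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical option parsing (not the connector's real code). */
final class RssOptionsSketch {
    // Default taken from the usage remarks: 10 minutes.
    static final Duration DEFAULT_REFRESH_INTERVAL = Duration.ofMinutes(10);

    /** Splits the 'uri' option on commas, trimming whitespace and skipping empty parts. */
    static List<String> parseUris(String uriOption) {
        List<String> uris = new ArrayList<>();
        for (String part : uriOption.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                uris.add(trimmed);
            }
        }
        return uris;
    }

    /** Reads 'refresh-interval' as milliseconds, falling back to the default when absent. */
    static Duration parseRefreshInterval(Map<String, String> options) {
        String raw = options.get("refresh-interval");
        if (raw == null) {
            return DEFAULT_REFRESH_INTERVAL;
        }
        return Duration.ofMillis(Long.parseLong(raw.trim()));
    }
}
```

With the example query's options, `parseUris` would yield the three New York Times feed URLs and `parseRefreshInterval` a five-second interval.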

Some examples of RSS news feeds:
