Server

The server code is located in the server/ directory. It is implemented in Rust using the Rocket framework.

Routes

All server endpoints, also referred to as routes, are implemented in the routes/ directory. The Jobs Feed server follows the REST architectural style. Routes are split into separate files by entity type for better organization and maintainability.

Entities

To simplify storing entities such as sources, job postings, or filters in a database, Jobs Feed uses SeaORM. Each entity is stored in a specific table; for example, source entities are stored in the PostgreSQL table named sources. SeaORM generates a Rust struct for each entity type, enabling easy deletion, modification, and insertion into the database.

The generated entity structs are located in the entity/ directory. These files are generated by SeaORM and should not be modified manually, as any changes will be overwritten. Updating these models requires changing the database schema and the corresponding SeaORM migration.

Connecting to the Database

To allow parallel access to the database, pool.rs implements a custom database connection pool for SeaORM. Sharing a single database connection through Rocket's State would instead force workers to execute sequentially, which is less efficient.

Job Extraction

To extract job postings from source pages, Jobs Feed loads each source URL in a headless browser. This ensures that pages heavily reliant on JavaScript are rendered correctly before their content is extracted.

Jobs Feed extracts the raw content of these source pages. The extracted content is cached after each refresh run and used to create a diff, ensuring that only new or changed content is used for extracting job postings. This content is then sent to the OpenAI API along with the configured filter information to extract relevant job posting titles and descriptions. To handle large source content that exceeds the context window size, the content is split into smaller messages up to a configured maximum size.
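The diff and splitting steps can be illustrated with a minimal sketch. It is not the Jobs Feed implementation. It assumes a simple line-level diff and a size limit measured in bytes, and the function names new_lines and split_into_chunks are hypothetical.

```rust
use std::collections::HashSet;

/// Returns the trimmed, non-empty lines of `current` that do not occur in
/// `cached`, in their original order. Assumption: a line-level comparison is
/// a good enough diff for illustration.
pub fn new_lines<'a>(cached: &str, current: &'a str) -> Vec<&'a str> {
    let seen: HashSet<&str> = cached.lines().map(str::trim).collect();
    current
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !seen.contains(*line))
        .collect()
}

/// Groups `lines` into chunks of at most `max_len` bytes, breaking only at
/// line boundaries. A single line longer than `max_len` is kept whole and
/// becomes its own oversized chunk.
pub fn split_into_chunks(lines: &[&str], max_len: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for line in lines {
        // The extra byte accounts for the newline joining two lines.
        let needed = if current.is_empty() {
            line.len()
        } else {
            current.len() + 1 + line.len()
        };
        // Close the current chunk if adding this line would exceed the limit.
        if !current.is_empty() && needed > max_len {
            chunks.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push('\n');
        }
        current.push_str(line);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

In this sketch, each returned chunk would become one message sent to the API. A real limit would more likely be expressed in tokens than in bytes, which is why the size measure is called out as an assumption.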

The resulting job titles are then used to extract additional information for each posting. Jobs Feed searches the source content for these titles and performs click actions on the HTML elements containing them. In some cases, this will open a new URL or window with more job posting details, which are then stored in Jobs Feed.

Job Recommendations

Users can rate job postings, and these ratings help highlight similar job postings that users may like and filter out those they dislike. To determine the similarity of job postings, the content and titles are used to create embeddings with the OpenAI embeddings API. Each posting is assigned an embedding vector, and the cosine similarity between job postings is computed. For each extracted job posting, a similarity score to a set of "liked" and "disliked" postings is computed to determine whether the posting might be a good match.
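The similarity computation can be sketched as follows. The cosine similarity itself is standard. The way scores are combined is an assumption, since the text does not specify it: here it is the mean similarity to liked postings minus the mean similarity to disliked ones. The function names are hypothetical.

```rust
/// Cosine similarity of two embedding vectors. Returns None if the vectors
/// are empty, differ in length, or either has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Mean similarity between `candidate` and every embedding in `rated`.
/// Embeddings that cannot be compared are skipped; an empty set yields 0.0.
fn mean_similarity(candidate: &[f32], rated: &[Vec<f32>]) -> f32 {
    let sims: Vec<f32> = rated
        .iter()
        .filter_map(|r| cosine_similarity(candidate, r))
        .collect();
    if sims.is_empty() {
        0.0
    } else {
        sims.iter().sum::<f32>() / sims.len() as f32
    }
}

/// Assumed scoring rule: a positive score means the candidate is closer to
/// liked postings, a negative score means it is closer to disliked ones.
pub fn match_score(candidate: &[f32], liked: &[Vec<f32>], disliked: &[Vec<f32>]) -> f32 {
    mean_similarity(candidate, liked) - mean_similarity(candidate, disliked)
}
```

Under this rule, a posting whose embedding points in the same direction as a liked posting and is orthogonal to every disliked one scores close to 1. Other combinations, such as using the maximum similarity instead of the mean, would fit the description equally well.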
