crabbysearch/src/results/aggregator.rs

388 lines
14 KiB
Rust
Raw Normal View History

2023-04-27 17:53:28 +03:00
//! This module provides the functionality to scrape and gathers all the results from the upstream
//! search engines and then removes duplicate results.
use super::user_agent::random_user_agent;
use crate::config::parser::Config;
use crate::handler::{file_path, FileType};
use crate::models::{
aggregation_models::{EngineErrorInfo, SearchResult, SearchResults},
engine_models::{EngineError, EngineHandler},
};
use error_stack::Report;
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
use futures::stream::FuturesUnordered;
use regex::Regex;
use reqwest::{Client, ClientBuilder};
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
use std::sync::Arc;
use std::time::{SystemTime, UNIX_EPOCH};
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
use tokio::{
fs::File,
io::{AsyncBufReadExt, BufReader},
task::JoinHandle,
time::Duration,
};
2023-04-22 14:35:07 +03:00
/// A constant for holding the prebuilt Client globally in the app.
static CLIENT: std::sync::OnceLock<Client> = std::sync::OnceLock::new();
2023-07-15 13:36:46 +03:00
/// Aliases for long type annotations
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
type FutureVec =
FuturesUnordered<JoinHandle<Result<Vec<(String, SearchResult)>, Report<EngineError>>>>;
/// The function aggregates the scraped results from the user-selected upstream search engines.
/// These engines can be chosen either from the user interface (UI) or from the configuration file.
/// The code handles this process by matching the selected search engines and adding them to a vector.
/// This vector is then used to create an asynchronous task vector using `tokio::spawn`, which returns
/// a future. This future is awaited in another loop. Once the results are collected, they are filtered
/// to remove any errors and ensure only proper results are included. If an error is encountered, it is
/// sent to the UI along with the name of the engine and the type of error. This information is finally
/// placed in the returned `SearchResults` struct.
///
/// Additionally, the function eliminates duplicate results. If two results are identified as coming from
/// multiple engines, their names are combined to indicate that the results were fetched from these upstream
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
/// engines. After this, all the data in the `Vec` is removed and placed into a struct that contains all
/// the aggregated results in a vector. Furthermore, the query used is also added to the struct. This step is
/// necessary to ensure that the search bar in the search remains populated even when searched from the query URL.
///
/// Overall, this function serves to aggregate scraped results from user-selected search engines, handling errors,
/// removing duplicates, and organizing the data for display in the UI.
2023-04-27 17:53:28 +03:00
///
/// # Example:
///
/// If you search from the url like `https://127.0.0.1/search?q=huston` then the search bar should
/// contain the word huston and not remain empty.
///
2023-04-27 17:53:28 +03:00
/// # Arguments
///
/// * `query` - Accepts a string to query with the above upstream search engines.
/// * `page` - Accepts an u32 page number.
/// * `random_delay` - Accepts a boolean value to add a random delay before making the request.
2023-07-15 13:36:46 +03:00
/// * `debug` - Accepts a boolean value to enable or disable debug mode option.
/// * `upstream_search_engines` - Accepts a vector of search engine names which was selected by the
2023-07-30 17:08:47 +03:00
/// * `request_timeout` - Accepts a time (secs) as a value which controls the server request timeout.
2023-07-15 13:36:46 +03:00
/// user through the UI or the config file.
2023-04-27 17:53:28 +03:00
///
/// # Error
///
/// Returns an error a reqwest and scraping selector errors if any error occurs in the results
2023-04-27 17:53:28 +03:00
/// function in either `searx` or `duckduckgo` or both otherwise returns a `SearchResults struct`
/// containing appropriate values.
2023-04-22 14:35:07 +03:00
pub async fn aggregate(
query: &str,
page: u32,
config: &Config,
upstream_search_engines: &[EngineHandler],
safe_search: u8,
2023-04-22 14:35:07 +03:00
) -> Result<SearchResults, Box<dyn std::error::Error>> {
let client = CLIENT.get_or_init(|| {
ClientBuilder::new()
.timeout(Duration::from_secs(config.request_timeout as u64)) // Add timeout to request to avoid DDOSing the server
.connect_timeout(Duration::from_secs(config.request_timeout as u64)) // Add timeout to request to avoid DDOSing the server
.https_only(true)
.gzip(true)
.brotli(true)
.http2_adaptive_window(config.adaptive_window)
.build()
.unwrap()
});
let user_agent: &str = random_user_agent();
2023-04-22 14:35:07 +03:00
// Add a random delay before making the request.
if config.aggregator.random_delay || !config.debug {
let nanos = SystemTime::now().duration_since(UNIX_EPOCH)?.subsec_nanos() as f32;
let delay = ((nanos / 1_0000_0000 as f32).floor() as u64) + 1;
tokio::time::sleep(Duration::from_secs(delay)).await;
}
let mut names: Vec<&str> = Vec::with_capacity(0);
// create tasks for upstream result fetching
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let tasks: FutureVec = FutureVec::new();
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let query: Arc<String> = Arc::new(query.to_string());
for engine_handler in upstream_search_engines {
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let (name, search_engine) = engine_handler.clone().into_name_engine();
names.push(name);
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let query_partially_cloned = query.clone();
tasks.push(tokio::spawn(async move {
search_engine
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
.results(
&query_partially_cloned,
page,
user_agent,
client,
safe_search,
)
.await
}));
}
2023-06-15 06:27:45 +08:00
// get upstream responses
let mut responses = Vec::with_capacity(tasks.len());
2023-04-22 14:35:07 +03:00
for task in tasks {
if let Ok(result) = task.await {
responses.push(result)
}
}
2023-04-22 14:35:07 +03:00
// aggregate search results, removing duplicates and handling errors the upstream engines returned
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let mut result_map: Vec<(String, SearchResult)> = Vec::new();
let mut engine_errors_info: Vec<EngineErrorInfo> = Vec::new();
let mut handle_error = |error: &Report<EngineError>, engine_name: &'static str| {
log::error!("Engine Error: {:?}", error);
engine_errors_info.push(EngineErrorInfo::new(
error.downcast_ref::<EngineError>().unwrap(),
engine_name,
));
};
for _ in 0..responses.len() {
let response = responses.pop().unwrap();
let engine = names.pop().unwrap();
if result_map.is_empty() {
match response {
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
Ok(results) => result_map = results,
Err(error) => handle_error(&error, engine),
};
continue;
}
match response {
Ok(result) => {
result.into_iter().for_each(|(key, value)| {
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
match result_map.iter().find(|(key_s, _)| key_s == &key) {
Some(value) => value.1.to_owned().add_engines(engine),
None => result_map.push((key, value)),
};
});
}
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
Err(error) => handle_error(&error, engine),
};
}
if safe_search >= 3 {
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let mut blacklist_map: Vec<(String, SearchResult)> = Vec::new();
filter_with_lists(
&mut result_map,
&mut blacklist_map,
2023-09-10 18:56:54 +03:00
file_path(FileType::BlockList)?,
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
)
.await?;
filter_with_lists(
&mut blacklist_map,
&mut result_map,
2023-09-10 18:56:54 +03:00
file_path(FileType::AllowList)?,
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
)
.await?;
drop(blacklist_map);
}
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let results: Vec<SearchResult> = result_map.iter().map(|(_, value)| value.clone()).collect();
2023-04-22 14:35:07 +03:00
Ok(SearchResults::new(results, &engine_errors_info))
2023-04-22 14:35:07 +03:00
}
/// Filters a map of search results using a list of regex patterns.
///
/// # Arguments
///
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
/// * `map_to_be_filtered` - A mutable reference to a `Vec` of search results to filter, where the filtered results will be removed from.
/// * `resultant_map` - A mutable reference to a `Vec` to hold the filtered results.
/// * `file_path` - A `&str` representing the path to a file containing regex patterns to use for filtering.
///
/// # Errors
///
/// Returns an error if the file at `file_path` cannot be opened or read, or if a regex pattern is invalid.
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
pub async fn filter_with_lists(
map_to_be_filtered: &mut Vec<(String, SearchResult)>,
resultant_map: &mut Vec<(String, SearchResult)>,
file_path: &str,
) -> Result<(), Box<dyn std::error::Error>> {
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let reader = BufReader::new(File::open(file_path).await?);
let mut lines = reader.lines();
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
while let Some(line) = lines.next_line().await? {
let re = Regex::new(line.trim())?;
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let mut length = map_to_be_filtered.len();
let mut idx: usize = Default::default();
// Iterate over each search result in the map and check if it matches the regex pattern
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
while idx < length {
let ele = &map_to_be_filtered[idx];
let ele_inner = &ele.1;
match re.is_match(&ele.0.to_lowercase())
|| re.is_match(&ele_inner.title.to_lowercase())
|| re.is_match(&ele_inner.description.to_lowercase())
{
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
true => {
// If the search result matches the regex pattern, move it from the original map to the resultant map
resultant_map.push(map_to_be_filtered.swap_remove(idx));
length -= 1;
}
false => idx += 1,
};
}
}
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
use smallvec::smallvec;
use std::io::Write;
use tempfile::NamedTempFile;
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
#[tokio::test]
async fn test_filter_with_lists() -> Result<(), Box<dyn std::error::Error>> {
// Create a map of search results to filter
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let mut map_to_be_filtered = Vec::new();
map_to_be_filtered.push((
"https://www.example.com".to_owned(),
SearchResult {
title: "Example Domain".to_owned(),
url: "https://www.example.com".to_owned(),
2023-08-24 09:50:19 +08:00
description: "This domain is for use in illustrative examples in documents."
.to_owned(),
engine: smallvec!["Google".to_owned(), "Bing".to_owned()],
},
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
));
map_to_be_filtered.push((
"https://www.rust-lang.org/".to_owned(),
SearchResult {
title: "Rust Programming Language".to_owned(),
url: "https://www.rust-lang.org/".to_owned(),
description: "A systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.".to_owned(),
engine: smallvec!["Google".to_owned(), "DuckDuckGo".to_owned()],
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
},)
);
// Create a temporary file with regex patterns
let mut file = NamedTempFile::new()?;
writeln!(file, "example")?;
writeln!(file, "rust")?;
file.flush()?;
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let mut resultant_map = Vec::new();
2023-08-24 09:50:19 +08:00
filter_with_lists(
&mut map_to_be_filtered,
&mut resultant_map,
file.path().to_str().unwrap(),
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
)
.await?;
assert_eq!(resultant_map.len(), 2);
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
assert!(resultant_map
.iter()
.any(|(key, _)| key == "https://www.example.com"));
assert!(resultant_map
.iter()
.any(|(key, _)| key == "https://www.rust-lang.org/"));
assert_eq!(map_to_be_filtered.len(), 0);
Ok(())
}
2023-08-24 09:32:22 +08:00
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
#[tokio::test]
async fn test_filter_with_lists_wildcard() -> Result<(), Box<dyn std::error::Error>> {
let mut map_to_be_filtered = Vec::new();
map_to_be_filtered.push((
"https://www.example.com".to_owned(),
SearchResult {
title: "Example Domain".to_owned(),
url: "https://www.example.com".to_owned(),
2023-08-24 09:50:19 +08:00
description: "This domain is for use in illustrative examples in documents."
.to_owned(),
engine: smallvec!["Google".to_owned(), "Bing".to_owned()],
},
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
));
map_to_be_filtered.push((
"https://www.rust-lang.org/".to_owned(),
SearchResult {
title: "Rust Programming Language".to_owned(),
url: "https://www.rust-lang.org/".to_owned(),
description: "A systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.".to_owned(),
engine: smallvec!["Google".to_owned(), "DuckDuckGo".to_owned()],
},
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
));
// Create a temporary file with a regex pattern containing a wildcard
let mut file = NamedTempFile::new()?;
writeln!(file, "ex.*le")?;
file.flush()?;
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let mut resultant_map = Vec::new();
2023-08-24 09:50:19 +08:00
filter_with_lists(
&mut map_to_be_filtered,
&mut resultant_map,
file.path().to_str().unwrap(),
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
)
.await?;
assert_eq!(resultant_map.len(), 1);
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
assert!(resultant_map
.iter()
.any(|(key, _)| key == "https://www.example.com"));
assert_eq!(map_to_be_filtered.len(), 1);
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
assert!(map_to_be_filtered
.iter()
.any(|(key, _)| key == "https://www.rust-lang.org/"));
Ok(())
}
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
#[tokio::test]
async fn test_filter_with_lists_file_not_found() {
let mut map_to_be_filtered = Vec::new();
2023-08-24 09:32:22 +08:00
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let mut resultant_map = Vec::new();
2023-08-24 09:32:22 +08:00
// Call the `filter_with_lists` function with a non-existent file path
2023-08-24 09:50:19 +08:00
let result = filter_with_lists(
&mut map_to_be_filtered,
&mut resultant_map,
"non-existent-file.txt",
);
2023-08-24 09:32:22 +08:00
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
assert!(result.await.is_err());
2023-08-24 09:32:22 +08:00
}
2023-08-24 09:36:08 +08:00
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
#[tokio::test]
async fn test_filter_with_lists_invalid_regex() {
let mut map_to_be_filtered = Vec::new();
map_to_be_filtered.push((
"https://www.example.com".to_owned(),
2023-08-24 09:36:08 +08:00
SearchResult {
title: "Example Domain".to_owned(),
url: "https://www.example.com".to_owned(),
2023-08-24 09:50:19 +08:00
description: "This domain is for use in illustrative examples in documents."
.to_owned(),
engine: smallvec!["Google".to_owned(), "Bing".to_owned()],
2023-08-24 09:36:08 +08:00
},
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
));
2023-08-24 09:36:08 +08:00
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
let mut resultant_map = Vec::new();
2023-08-24 09:36:08 +08:00
// Create a temporary file with an invalid regex pattern
let mut file = NamedTempFile::new().unwrap();
writeln!(file, "example(").unwrap();
file.flush().unwrap();
2023-08-24 09:50:19 +08:00
let result = filter_with_lists(
&mut map_to_be_filtered,
&mut resultant_map,
file.path().to_str().unwrap(),
);
2023-08-24 09:36:08 +08:00
:zap: perf: several optimizations for improving the performance of the engine (#540) * :recycle: refactor: initialize & store the config & cache structs as a constant (#486) - initializes & stores the config & cache structs as a static constant. - Pass the config & cache structs as a static reference to all the functions handling their respective route. * :zap: perf: replace hashmaps with vectors for fetching & aggregating results (#486) - replace hashmaps with vectors for fetching, collecting & aggregating results as it tends to be contigous & cache efficient data structure. - refactor & redesign algorithms for fetching & aggregating results centered around vectors in aggregate function. * :heavy_plus_sign: build: add the future crate (#486) * :zap: perf: use `futureunordered` for collecting results fetched from the tokio spawn tasks (#486) - using the `futureunordered` instead of vector for collecting results reduces the time it takes to fetch the results as the results do not need to come in specific order so any result that gets fetched first gets collected in the `futureunordered` type. Co-authored-by: Spencerjibz <spencernajib2@gmail.com> * :zap: perf: initialize new async connections parallely using tokio spawn tasks (#486) * :zap: perf: initialize redis pipeline struct once with the default size of 3 (#486) * :zap: perf: reduce branch predictions by reducing conditional code branches (#486) * :white_check_mark: test(unit): provide unit test for the `get_safesearch_level` function (#486) * :zap: perf: reduce clones & use index based loop to improve search results filtering performance (#486) * 🚨 fix(clippy): make clippy/format checks happy (#486) * 🚨 fix(build): make the cargo build check happy (#486) * :zap: perf: reduce the amount of clones, to_owneds & to_strings (#486) * :zap: perf: use async crates & methods & make functions async (#486) * :bookmark: chore(release): bump the app version (#486) --------- Co-authored-by: Spencerjibz <spencernajib2@gmail.com>
2024-03-11 12:01:30 +03:00
assert!(result.await.is_err());
2023-08-24 09:50:19 +08:00
}
2023-08-24 09:36:08 +08:00
}