1. Modern data systems (#data-systems)
    - Composable data system ecosystem
    - S3, BCS, BCOS, Iceberg, Hive, Trino, Spark, Superset, Parquet, Arrow, Kafka
    - Sources: Voltron Data, "Designing Data-Intensive Applications", CMU Database Group
    - Projects: ADBC driver, ideation platform Iceberg integration
2. Software engineering (#swe)
    - Writing better software, being more effective
    - Language proficiency: Rust, C++, and systems programming
    - Go, OCaml
    - Other engineering skills (testing, core technologies, Linux interface, GNU tools, etc.)
3. Computer science (#cs)
    - Data structures, algorithms, mathematics, and theory of computing
    - MIT OCW, books
4. DevOps (#devops)
    - Cloud native computing, Docker, Kubernetes, packaging, telemetry, networking, system design, etc. Know how to deploy and maintain software.
5. ML/data science (#ml/ds)
    - Keep up with modern literature and practices

# WIP

| Status | Medium | Category | Description | Link |
| --- | --- | --- | --- | --- |
| #WIP | #book | #data-systems | In-Memory Analytics with Apache Arrow | [[in-memory-analytics-with-apache-arrow]] |
| #WIP | #paper | #data-systems | Towards a self-tuning RISC-style DB system | https://www.cs.cmu.edu/~natassa/courses/15-721/papers/P001.pdf |
| #WIP | #article | #data-systems | [[voltron-composable-codex]] | https://voltrondata.com/codex |
| #WIP | #book | #swe | Rust in Action | |
| #WIP | #article | #cs | Log-structured hash table | https://www.siddharthjain.dev/posts/2020/designing-data-intensive-applications-hash-indexes/ |

# Backlog

- #rust https://adventures.michaelfbryan.com/posts/daily/slice-patterns/
- #rust https://adventures.michaelfbryan.com/posts/rust-best-practices/bad-habits/
- #cs https://www.cs.utexas.edu/~moore/best-ideas/
- #book #data-systems #swe Designing Data-Intensive Applications
- #book Crafting Interpreters
- #book C++ Crash Course
- #book Effective C++
- #book Software Engineering at Google
- #book Designing Microservices
- #book Compilers - The Dragon Book
- #book Michael Abrash's Graphics Programming Black Book https://www.jagregory.com/abrash-black-book/
- https://12factor.net/
- https://www.drcathicks.com/post/reducing-the-deficit-in-tech
- https://www.drcathicks.com/post/coding-in-the-dark-a-year-later
- https://www.drcathicks.com/post/sense-of-belonging-and-software-teams
- https://iggyfernandez.wordpress.com/2011/12/23/nocoug-journal-interview-professor-stonebraker/
- https://vickiboykis.com/2022/12/05/the-cloudy-layers-of-modern-day-programming/
- https://rachelbythebay.com/w/2020/08/14/jobs/
- https://data-apis.org/dataframe-api/draft/
- https://roundup.getdbt.com/p/problem-exists-between-database-and
- [[dont-hold-my-data-hostage-a-case-for-client-protocol-redesign]] https://www.vldb.org/pvldb/vol10/p1022-muehleisen.pdf
- [[infinite-loop-of-sadness-slack-data-engineering]] https://www.youtube.com/watch?v=EtYv7zPyS2A
- [[pycon-2018-keynote]] https://www.youtube.com/watch?v=ITksU31c1WY
- Guido speech https://neopythonic.blogspot.com/2016/04/kings-day-speech.html
- [[for-sql]] https://databased.pedramnavid.com/p/for-sql
- [[magpie-python-at-speed-and-scale-using-cloud-backends]] https://www.cidrdb.org/cidr2021/papers/cidr2021_paper08.pdf
- https://voltrondata.com/resources/open-source-standards
- https://voltrondata.com/resources/12-open-source-projects-to-watch-2023
- [[the-ai-hierarchy-of-needs]] https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007
- https://roundup.getdbt.com/p/ep-37-what-does-apache-arrow-unlock
- https://vickiboykis.com/2023/09/13/build-and-keep-your-context-window/
- [[composable-data-management-system-manifesto]] https://www.vldb.org/pvldb/vol16/p2679-pedreira.pdf
- [[shared-foundations-modernizing-metas-data-lakehouse]] https://research.facebook.com/publications/shared-foundations-modernizing-metas-data-lakehouse/
- [[all-in-on-apache-arrow]] https://blog.streamlit.io/all-in-on-apache-arrow/
- [[emerging-architectures-for-modern-data-infrastructure]] https://a16z.com/emerging-architectures-for-modern-data-infrastructure/
- The Hardware Lottery https://hardwarelottery.github.io/
- CMU DB group Modern OLAP Database Systems https://www.youtube.com/watch?v=5J-I8Mj8tss
- 1 billion rows in Go https://benhoyt.com/writings/go-1brc/
- How Discord Stores Trillions of Messages https://discord.com/blog/how-discord-stores-trillions-of-messages
- How Discord Migrated Trillions of Messages from Cassandra to ScyllaDB - YouTube https://www.youtube.com/watch?v=S2xmFOAUhsk
- Cloudflare Rust foundations library https://blog.cloudflare.com/introducing-foundations-our-open-source-rust-service-foundation-library
- https://github.com/cloudflare/foundations
- Why Discord is switching from Go to Rust https://discord.com/blog/why-discord-is-switching-from-go-to-rust
- Crust of Rust https://www.youtube.com/watch?v=rAl-9HwD858&list=PLqbS7AVVErFiWDOAVrPt7aYmnuuOLYvOa
- Vim virtual text https://jdhao.github.io/2021/09/09/nvim_use_virtual_text/
- #course #theory MIT advanced data structures https://ocw.mit.edu/courses/6-851-advanced-data-structures-spring-2012/video_galleries/lecture-videos/
- #course #theory MIT advanced algorithms https://ocw.mit.edu/courses/6-854j-advanced-algorithms-fall-2008/pages/readings/
- #course #theory MIT distributed algorithms https://ocw.mit.edu/courses/6-852j-distributed-algorithms-fall-2009/pages/lecture-notes/
- #course #theory MIT real analysis https://ocw.mit.edu/courses/18-100a-real-analysis-fall-2020/video_galleries/video-lectures/
- #course #theory MIT great ideas in theoretical computer science https://ocw.mit.edu/courses/6-080-great-ideas-in-theoretical-computer-science-spring-2008/pages/lecture-notes/
- #course #theory MIT an algorithmist's toolkit https://ocw.mit.edu/courses/18-409-topics-in-theoretical-computer-science-an-algorithmists-toolkit-fall-2009/pages/lecture-notes/
- #talk #data-systems Event-Driven Architectures Done Right, Apache
Kafka • Tim Berglund • Devoxx Poland 2021 https://www.youtube.com/watch?v=A_mstzRGfIE
- #course CMU hardware accelerated database lectures https://www.youtube.com/playlist?list=PLSE8ODhjZXjbjOyrcqgE6_lCV6xvzffSN
- #course CMU advanced database systems https://www.youtube.com/watch?v=LWS8LEQAUVc&list=PLSE8ODhjZXjYzlLMbX3cR0sxWnRM7CLFn
- #course CMU database query optimizers https://www.youtube.com/watch?v=pQe1LQJiXN0&list=PLSE8ODhjZXjYPyrUG_YxqYPS7wjWY6gYN
- #course CMU intro to DB systems https://www.youtube.com/watch?v=DJ5u5HrbcMk&list=PLSE8ODhjZXjbj8BMuIrRcacnQh20hmY9g&index=4
- https://db.cs.cmu.edu/

## Debian packaging

- https://www.debian.org/doc/manuals/packaging-tutorial/packaging-tutorial.pdf
- https://wiki.archlinux.org/title/Creating_packages_for_other_distributions
- https://wiki.debian.org/SimplePackagingTutorial#What_is_Debian_packaging
- https://wiki.debian.org/HowToPackageForDebian
- https://canonical-ubuntu-packaging-guide.readthedocs-hosted.com/en/latest/
- https://www.debian.org/doc/manuals/developers-reference/index.en.html
- https://www.debian.org/doc/manuals/maint-guide/dreq.en.html
- https://ubuntuforums.org/showthread.php?t=910717

## Data Tech

- Ibis: https://ibis-project.org/why
- Sourcegraph: https://sourcegraph.com/docs
- Cody: https://sourcegraph.com/docs/cody/clients/install-neovim
- https://voltrondata.com/resources/go-inside-the-arrow-database-connectivity-roadmap-background-and-community
- https://medium.com/snowflake/arrow-database-connectivity-adbc-support-for-snowflake-7bfb3a2d9074
- Apache Spark https://spark.apache.org/streaming/
- Starburst https://docs.starburst.io/introduction/choose-your-starburst-product.html
- Iceberg https://iceberg.apache.org/
- Elasticsearch https://www.elastic.co/elasticsearch
- Superset https://superset.apache.org/
- Apache Airflow https://airflow.apache.org/
- Trino | Distributed SQL query engine for big data https://trino.io/
- Apache Spark™ - Unified Engine for large-scale data analytics
https://spark.apache.org/
- Apache Iceberg - Apache Iceberg https://iceberg.apache.org/
- What is Apache Parquet? https://www.databricks.com/glossary/what-is-parquet
- Apache Projects List https://projects.apache.org/projects.html?category#big-data
- Apache Project Information https://projects.apache.org/project.html?helix
- MapReduce - Wikipedia https://en.wikipedia.org/wiki/MapReduce
- What is Apache MapReduce? | IBM https://www.ibm.com/topics/mapreduce
- synchronizing databases - Google Search https://www.google.com/search?q=synchronizing+databases
- The Complete Data Synchronization Guide | Veritas
https://www.veritas.com/information-center/data-synchronization
- csv - Is there a simple way to load parquet files directly into Cassandra? - Stack Overflow https://stackoverflow.com/questions/58709751/is-there-a-simple-way-to-load-parquet-files-directly-into-cassandra
- Spark Structured Streaming | Apache Spark https://spark.apache.org/streaming/
- Structured Streaming Programming Guide - Spark 3.5.0 Documentation https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html

### DB Update Events

- postgres update events - Google Search https://www.google.com/search?q=postgres+update+events
- PostgreSQL trigger an event on table update - Stack Overflow https://stackoverflow.com/questions/55218826/postgresql-trigger-an-event-on-table-update
- cassandra update events - Google Search https://www.google.com/search?q=cassandra+update+events
- java - Is there any mechanism in cassandra that sends notification when table is changed through INSERT or UPDATE query? - Stack Overflow https://stackoverflow.com/questions/38429421/is-there-any-mechanism-in-cassandra-that-sends-notification-when-table-is-change
- CREATE TRIGGER https://docs.datastax.com/en/archived/cql/3.1/cql/cql_reference/trigger_r.html
- What’s New in Cassandra 2.0: Prototype Triggers Support | Datastax https://www.datastax.com/blog/whats-new-cassandra-20-prototype-triggers-support
- markreddy/cassandra-trigger-example: Example Cassandra triggers https://github.com/markreddy/cassandra-trigger-example
- java - Listen for Changes In Cassandra Datastore? - Stack Overflow https://stackoverflow.com/questions/29972742/listen-for-changes-in-cassandra-datastore
- Change Data Capture | Apache Cassandra Documentation https://cassandra.apache.org/doc/stable/cassandra/operating/cdc.html
- Publish events from cassandra to kafka via triggers | by (λx.x)eranga | Effectz.AI | Medium https://medium.com/rahasak/publish-events-from-cassandra-to-kafka-via-cassandra-triggers-59818dcf7eed

## Algos

- String Searching
    - Knuth-Morris-Pratt
    - Aho-Corasick
    - Boyer-Moore
    - Flashtext
- Aho–Corasick algorithm - Wikipedia https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm
- https://en.wikipedia.org/wiki/Grep
- https://en.wikipedia.org/wiki/Regular_expression#Implementations_and_running_times
- https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string-search_algorithm
- https://onlinelibrary.wiley.com/doi/10.1002/spe.4380211105
- https://www.google.com/search?q=aho+corasick+for+regex
- https://www.freecodecamp.org/news/regex-was-taking-5-days-flashtext-does-it-in-15-minutes-55f04411025f/
- https://stackoverflow.com/questions/44178449/regex-replace-is-taking-time-for-millions-of-documents-how-to-make-it-faster
- https://github.com/vi3k6i5/flashtext
- https://arxiv.org/pdf/1711.00046.pdf
- https://en.wikipedia.org/wiki/Rope_(data_structure)
- https://en.wikipedia.org/wiki/Gap_buffer
- https://en.wikipedia.org/wiki/Piece_table
- https://en.wikipedia.org/wiki/Skip_list
## ML-AI

- 22 Free Image Datasets for Computer Vision | iMerit https://imerit.net/blog/22-free-image-datasets-for-computer-vision-all-pbm/
- Keras: the Python deep learning API https://keras.io/
- https://www.kaggle.com/ Kaggle: Your Machine Learning and Data Science Community
- Machine Learning & Data Science Forum Discussions | Kaggle https://www.kaggle.com/discussion
- Text Classification with Movie Reviews | TensorFlow Hub https://www.tensorflow.org/hub/tutorials/tf2_text_classification
- Deep Learning https://www.deeplearningbook.org/
- Neural networks and deep learning http://neuralnetworksanddeeplearning.com/
- Theoretical and Advanced Machine Learning | TensorFlow https://www.tensorflow.org/resources/learn-ml/theoretical-and-advanced-machine-learning
- Vosk Installation https://alphacephei.com/vosk/install
- SC2 AI Arena https://sc2ai.net/

### Datasets

- Top 20 Free Satellite Imagery Sources: Update For 2021 https://eos.com/blog/free-satellite-imagery-sources/
- LabelMe Dataset | Papers With Code https://paperswithcode.com/dataset/labelme
- OpenCV: Automatic License/Number Plate Recognition (ANPR) with Python - PyImageSearch https://pyimagesearch.com/2020/09/21/opencv-automatic-license-number-plate-recognition-anpr-with-python/
- DOB Complaints Received | NYC Open Data https://data.cityofnewyork.us/Housing-Development/DOB-Complaints-Received/eabe-havv
- DOB Violations | NYC Open Data https://data.cityofnewyork.us/Housing-Development/DOB-Violations/3h2n-5cm9
- DOB ECB Violations | NYC Open Data https://data.cityofnewyork.us/Housing-Development/DOB-ECB-Violations/6bgk-3dad
- Housing Maintenance Code Violations | NYC Open Data https://data.cityofnewyork.us/Housing-Development/Housing-Maintenance-Code-Violations/wvxf-dwi5
- DOHMH New York City Restaurant Inspection Results | NYC Open Data https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/43nn-pn8j
- Three-Quarter Housing Report – Violations | NYC Open Data
https://data.cityofnewyork.us/City-Government/Three-Quarter-Housing-Report-Violations/96te-xmyw
- Bedbug Reporting | NYC Open Data https://data.cityofnewyork.us/Housing-Development/Bedbug-Reporting/wz6d-d3jb
- HPD Building Info https://hpdonline.hpdnyc.org/HPDonline/provide_address.aspx
- Results for "Bedbug Reporting" | Page 1 of 1 | NYC Open Data https://data.cityofnewyork.us/browse?q=Bedbug%20Reporting&sortBy=relevance
- NYPD Hate Crimes | NYC Open Data https://data.cityofnewyork.us/Public-Safety/NYPD-Hate-Crimes/bqiq-cu78
- Citywide Crime Statistics | NYC Open Data https://data.cityofnewyork.us/Public-Safety/Citywide-Crime-Statistics/c5dk-m6ea
- Police Precincts | NYC Open Data https://data.cityofnewyork.us/Public-Safety/Police-Precincts/78dh-3ptz
- Crime Enforcement Activity | NYC Open Data https://data.cityofnewyork.us/Public-Safety/Crime-Enforcement-Activity/qk6i-zcht
- NYC Park Crime Data | NYC Open Data https://data.cityofnewyork.us/Public-Safety/NYC-Park-Crime-Data/ezds-sqp6
- NYPD Arrest Data (Year to Date) | NYC Open Data https://data.cityofnewyork.us/Public-Safety/NYPD-Arrest-Data-Year-to-Date-/uip8-fykc
- NYPD Shooting Incident Data (Historic) | NYC Open Data https://data.cityofnewyork.us/Public-Safety/NYPD-Shooting-Incident-Data-Historic-/833y-fsy8
- NYPD Shooting Incident Data (Year To Date) | NYC Open Data https://data.cityofnewyork.us/Public-Safety/NYPD-Shooting-Incident-Data-Year-To-Date-/5ucz-vwe8
- NYPD Complaint Map (Year to Date) | NYC Open Data https://data.cityofnewyork.us/Public-Safety/NYPD-Complaint-Map-Year-to-Date-/2fra-mtpn

### Audio ML

- machine learning audio voice recognition - Google Search
https://www.google.com/search?q=machine+learning+audio+voice+recognition
- Audio Deep Learning Made Simple: Automatic Speech Recognition (ASR), How it Works | by Ketan Doshi | Towards Data Science https://towardsdatascience.com/audio-deep-learning-made-simple-automatic-speech-recognition-asr-how-it-works-716cfce4c706
- voice recognition | Kaggle https://www.kaggle.com/code/tduan007/voice-recognition
- attention model - Google Search https://www.google.com/search?q=attention+model
- What is Automatic Speech Recognition?
| NVIDIA Technical Blog https://developer.nvidia.com/blog/essential-guide-to-automatic-speech-recognition-technology/
- Automatic speech recognition https://huggingface.co/docs/transformers/tasks/asr
- Models - Hugging Face https://huggingface.co/models?pipeline_tag=audio-classification&sort=trending
- facebook/wav2vec2-base · Hugging Face https://huggingface.co/facebook/wav2vec2-base
- Models - Hugging Face https://huggingface.co/models?pipeline_tag=audio-classification
- MIT/ast-finetuned-audioset-10-10-0.4593 · Hugging Face https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593
- [2104.01778] AST: Audio Spectrogram Transformer https://arxiv.org/abs/2104.01778
- Vision Transformer (ViT) https://huggingface.co/docs/transformers/model_doc/vit
- Biometric Authentication | Strong Customer Authentication | Nuance https://www.nuance.com/omni-channel-customer-engagement/authentication-and-fraud-prevention/biometric-authentication.html
- AI to Detect Speaker in a Speech. Using AI to detect the speaker in a… | by Abhishek Mungoli | Towards Data Science https://towardsdatascience.com/ai-to-detect-speaker-in-a-speech-a1dae5b597b0
- Machine Learning for Audio Classification | Engineering Education (EngEd) Program | Section https://www.section.io/engineering-education/machine-learning-for-audio-classification/
- Machine Learning for audio classification - YouTube https://www.youtube.com/watch?v=GxBG4wUWf4w
- Voice classification using Deep Learning, with Python | by Piero Paialunga | Towards Data Science https://towardsdatascience.com/voice-classification-using-deep-learning-with-python-6eddb9580381
- Audio classification | TensorFlow Lite https://www.tensorflow.org/lite/examples/audio_classification/overview
- What is Audio Classification? - Hugging Face https://huggingface.co/tasks/audio-classification
- superb/wav2vec2-base-superb-sid · Hugging Face https://huggingface.co/superb/wav2vec2-base-superb-sid

### Charity datasets

- About Us | Charity Navigator https://www.charitynavigator.org/about-us/
- CN_Strategic_Plan_2026.pdf https://www.charitynavigator.org/content/dam/cn/cn/pdf/CN_Strategic_Plan_2026.pdf
- charity api - Brave Search https://search.brave.com/search?q=charity+api&source=desktop
- CharityAPI - Charity Data API https://www.charityapi.org/
- Data Imports – CharityAPI.org API Reference https://docs.charityapi.org/#data-imports
- Charity Digital - Topics - The best open data sets available to charities https://charitydigital.org.uk/topics/topics/the-best-open-data-sets-available-to-charities---7628
- charity data on data.world | 12 datasets available https://data.world/datasets/charity
- Which public datasets are most useful for funders? - 360Giving https://www.threesixtygiving.org/2021/02/19/which-public-datasets-are-most-useful-for-funders/
- Useful Datasets | London Funders https://londonfunders.org.uk/resources-publications/tools-funders/useful-datasets
- irs charity database - Brave Search https://search.brave.com/search?q=irs+charity+database&source=desktop
- Tax Exempt Organization Search | Internal Revenue Service https://www.irs.gov/charities-non-profits/tax-exempt-organization-search
- Charities and nonprofits | Internal Revenue Service https://www.irs.gov/charities-and-nonprofits
- Charitable Organizations | Internal Revenue Service https://www.irs.gov/charities-non-profits/charitable-organizations
- data on how much charities receive - Brave Search https://search.brave.com/search?q=data+on+how+much+charities+receive&source=desktop
- 2023 Charitable Giving Statistics, Trends & Data: The Ultimate List of Charity Giving Stats | Nonprofits Source https://nonprofitssource.com/online-giving-statistics/
- charity nonprofit income reports - Brave Search https://search.brave.com/search?q=charity+nonprofit+income+reports&source=desktop
- where can i find nonprofit financial statements - Brave Search https://search.brave.com/search?q=where+can+i+find+nonprofit+financial+statements&source=web
- Where can I find an organization’s Form 990 or 990-PF? | Knowledge base | Candid Learning https://learning.candid.org/resources/knowledge-base/finding-990-990-pfs/
- Where can I find historical tax returns and annual reports for foundations? | Knowledge base | Candid Learning https://learning.candid.org/resources/knowledge-base/historical-returns/
- A-Z Databases: Data & Statistics https://iupui.libguides.com/az.php?t=10336&a=n
- Datasets · National Center for Charitable Statistics https://nccs.urban.org/nccs/datasets/
- NCCS Core Series · National Center for Charitable Statistics https://nccs.urban.org/nccs/datasets/core/
- EZProxy Authentication https://proxyauth.uits.iu.edu/auth/ulib.pl?url=https://www.icpsr.umich.edu/NACJD/
- NVSS - National Vital Statistics System Homepage https://www.cdc.gov/nchs/nvss/index.htm
- Forms 990-PF Request - IUPUI University Library https://inulib-fireform.eas.iu.edu/online/form/index/archives990pf
- Archives of the Foundation Center Historical Foundation Collection :: IUPUI University Library https://eddie.ulib.iupui.edu/fc/
- Ask an Archivist | University Library https://ulib.iupui.edu/special/forms/ask_an_archivist

## Uncategorized

- http://mikhailian.mova.org/node/284
- https://www.jetbrains.com/lp/devecosystem-2023/
- https://os.phil-opp.com/
- https://go.dev/learn/
- https://gobyexample.com/
- https://www.ibm.com/topics/oltp OLTP, OLAP explanation
- https://duckdb.org/docs/guides/import/query_postgres DuckDB docs
- https://www.vldb.org/pvldb/vol15/p3535-gaffney.pdf SQLite, DuckDB comparison
- https://github.com/gunnarmorling/1brc 1 billion row challenge
- https://blog.janestreet.com/the-joy-of-expect-tests/
- https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocument_inlineValue
- https://code.visualstudio.com/api/language-extensions/language-server-extension-guide#common-questions
- https://testcontainers.com/getting-started/
- https://www.docker.com/blog/8-top-docker-tips-tricks-for-2024/
- cpp
async promise future coroutine
- https://github.com/fuhsjr00/bug.n - Windows tiling window manager, open source and configurable
- The Open Source Definition | Open Source Initiative https://opensource.org/osd
- Mechanical-Intelligence-Volume-1.pdf - 1948-intelligent-machinery.pdf https://hashingit.com/elements/research-resources/1948-intelligent-machinery.pdf
- Oh shit, git! https://app.gumroad.com/d/c2bc05bdac498abc657d70f658e49a93
- The Art of Plain Text https://www.netmeister.org/blog/the-art-of-plain-text.html
- Welcome to Web 3.0! https://welcome2web3.com/#
- On the weaponisation of open source | Tales about Software Engineering https://beny23.github.io/posts/on_weaponisation_of_open_source/
- wizard-zine.pdf https://jvns.ca/wizard-zine.pdf
- How to use analogies for good not evil https://dynomight.net/analogies/
- Netflix https://www.netflix.com/browse
- C++ Weekly - Ep 404 - How (and Why) To Write Code That Avoids std::move - YouTube https://www.youtube.com/watch?v=6SaUwqw4ueE
- Building an ECS #1: Where are my Entities and Components | by Sander Mertens | Medium https://ajmmertens.medium.com/building-an-ecs-1-where-are-my-entities-and-components-63d07c7da742
- freebsd-src/lib/libc/db at main · freebsd/freebsd-src · GitHub https://github.com/freebsd/freebsd-src/tree/main/lib/libc/db
- Berkeley DB Tutorial and Reference Guide (Version: 4.1.24) http://web.mit.edu/ghudson/dev/third/rpm/db/docs/reftoc.html
- Berkeley DB Reference Guide: The big picture http://web.mit.edu/ghudson/dev/third/rpm/db/docs/ref/arch/bigpic.html
- Combinatorial optimization - Wikipedia https://en.wikipedia.org/wiki/Combinatorial_optimization
- Knapsack problem - Wikipedia https://en.wikipedia.org/wiki/Knapsack_problem
- 287: NP-Complete - explain xkcd https://www.explainxkcd.com/wiki/index.php/287:_NP-Complete
- Design Patterns in Python https://refactoring.guru/design-patterns/python
- Dev container Features contribution and discovery
https://containers.dev/implementors/features-distribution/
- Marvel Unlimited | Over 30,000 Comics. One All-New App! https://www.marvel.com/unlimited
- How to Become a Good Backend Engineer (Fundamentals) - YouTube https://www.youtube.com/watch?v=V3ZPPPKEipA
- Table of Contents · Crafting Interpreters https://craftinginterpreters.com/contents.html
- Increase Your Developer Productivity by Building a Toolchain | FEM | Frontend Masters https://frontendmasters.com/courses/developer-productivity/
- TypeScript, Golang, & Rust Side-by-Side | Learn Polyglot Programming with TypeScript, Go & Rust | Frontend Masters https://frontendmasters.com/courses/typescript-go-rust/
- journal.stuffwithstuff.com http://journal.stuffwithstuff.com/
- Engineering & Design – Discord Blog https://blog.discord.com/engineering-posts/home
- How Discord Stores Billions of Messages | by Stanislav Vishnevskiy | Discord Blog https://blog.discord.com/how-discord-stores-billions-of-messages-7fa6ec7ee4c7
- xkcd: Deviled Eggs https://xkcd.com/
- So Fucking Agile https://www.sofuckingagile.com/
- DYNOMIGHT INTERNET WEBSITE https://dynomight.net/
- Industrial Empathy https://www.industrialempathy.com/
- jenniferlynparsons/awesome-writing: An awesome list of information to help developers write better, kinder, more helpful documentation and learning materials https://github.com/jenniferlynparsons/awesome-writing
- Computer Science Conference Rankings https://webdocs.cs.ualberta.ca/~zaiane/htmldocs/ConfRanking.html
- Google Scholar https://scholar.google.com/schhp?hl=en&as_sdt=0,36
- Information and Entropy | Electrical Engineering and Computer Science | MIT OpenCourseWare https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-050j-information-and-entropy-spring-2008/
- Great Ideas in Theoretical Computer Science | Electrical Engineering and Computer Science | MIT OpenCourseWare
https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-080-great-ideas-in-theoretical-computer-science-spring-2008/
- Introduction to C++ | Electrical Engineering and Computer Science | MIT OpenCourseWare https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-096-introduction-to-c-january-iap-2011/
- Michael Abrash’s Graphics Programming Black Book, Special Edition https://www.jagregory.com/abrash-black-book/
- Working alone is so exhausting so I created my own assistant - DEV Community https://dev.to/happping_min/working-alone-is-so-exhausting-so-i-created-my-own-assistant-4mki
- What Are Python Wheels and Why Should You Care? – Real Python https://realpython.com/python-wheels/
- Why 4 Bloomberg engineers wrote another C++ book | Tech At Bloomberg https://www.techatbloomberg.com/blog/why-4-bloomberg-engineers-wrote-another-cplusplus-book/
- Microsoft Word - Large Object Storage - Camera Ready.doc - 0701168.pdf https://arxiv.org/ftp/cs/papers/0701/0701168.pdf
- Crafting Interpreters http://craftinginterpreters.com/
- Laws Order: The Web Page http://www.daviddfriedman.com/laws_order/index.shtml
- Game Programming Patterns http://gameprogrammingpatterns.com/
- Church-Turing thesis - Esolang https://esolangs.org/wiki/Church-Turing_thesis
- How We Work: Bloomberg's Industry Verticals Goes Agile - YouTube https://www.youtube.com/watch?v=l97M_AGdjRk
- The Hardest Program I’ve Ever Written – journal.stuffwithstuff.com http://journal.stuffwithstuff.com/2015/09/08/the-hardest-program-ive-ever-written/
- The 9 Deep Learning Papers You Need To Know About (Understanding CNNs Part 3) – Adit Deshpande – Engineering at Forward | UCLA CS '19 https://adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html
- ImageNet Classification with Deep Convolutional Neural
Networks - NIPS-2012-imagenet-classification-with-deep-convolutional-neural-networks-Paper.pdf https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
- Abuse and harassment on the blockchain https://blog.mollywhite.net/abuse-and-harassment-on-the-blockchain/
- Why is it so hard to buy things that work well? https://danluu.com/nothing-works/
- Microsoft Word - jeiSigmoid.doc - PAP07.pdf http://markfairchild.org/PDFs/PAP07.pdf
- Microsoft Word - 10-nagla.doc - download https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.446.6337&rep=rep1&type=pdf
- How I learned to stop worrying and structure all writing as a list https://dynomight.net/lists/
- Host a Website on Raspberry Pi https://fireship.io/lessons/host-website-raspberry-pi/
- The Ancient Secrets of Computer Vision - An Introduction to Computer Vision https://pjreddie.com/courses/computer-vision/
- How Not To Sort By Average Rating – Evan Miller https://www.evanmiller.org/how-not-to-sort-by-average-rating.html
- How I'm able to take notes in mathematics lectures using LaTeX and Vim | Gilles Castel https://castel.dev/post/lecture-notes-1/#postfix-snippets
- Why Don't You Use ... https://www.brendangregg.com/blog/2022-03-19/why-dont-you-use.html
- 10 very promising Open Source Projects you haven’t heard of - YouTube https://www.youtube.com/watch?v=qXUl3VsbA6o
- How I Built my Blog using MDX, Next.js, and React https://www.joshwcomeau.com/blog/how-i-built-my-blog/
- Pearl Leff | In Praise of Memorization http://www.pearlleff.com/in-praise-of-memorization
- Git https://git-scm.com/
- StackOverflow.org https://stackoverflow.org/
- Why You Should Start Self Hosting | Rohan Deshmukh https://rohanrd.xyz/posts/why-you-should-start-self-hosting/
- How Does a Database Work?
| Let’s Build a Simple Database https://cstack.github.io/db_tutorial/
- Chrome Developers https://developer.chrome.com/
- Superstimuli and the Collapse of Western Civilization - LessWrong https://www.lesswrong.com/s/MH2b8NfWv22dBtrs8/p/Jq73GozjsuhdwMLEG
- Supernormal Stimuli: Your Brain on Porn, Junk Food, and the Internet https://www.sparringmind.com/supernormal-stimuli/
- the morning paper | a random walk through Computer Science research, by Adrian Colyer https://blog.acolyer.org/
- Design Docs at Google https://www.industrialempathy.com/posts/design-docs-at-google/
- EC Requirements Proposal FINAL to Exec Comm 6-11-16.docx - pga_179129.pdf https://sites.nationalacademies.org/cs/groups/pgasite/documents/webpage/pga_179129.pdf
- Design Document - Google Docs https://docs.google.com/document/u/1/d/1NsTPrcaHVsol_GgrGRvzG_OQG7Qpg9uzfZjPuyQnHuM/edit?disco=AAAAWV4OzO8&usp=comment_email_document&ts=623f4a02&usp_dm=true
- jenniferlynparsons/awesome-writing: An awesome list of information to help developers write better, kinder, more helpful documentation and learning materials https://github.com/jenniferlynparsons/awesome-writing
- C++20 coroutines explained simply https://nmilo.ca/blog/coroutines.html
- OLI – Transforming higher education through the science of learning. https://oli.cmu.edu/
- cgtnotes.pdf https://www.math.colostate.edu/~hulpke/CGT/cgtnotes.pdf
- How is CMake used? - Stack Overflow https://stackoverflow.com/questions/26007566/how-is-cmake-used
- 1612.09375.pdf https://arxiv.org/pdf/1612.09375.pdf
- Valgrind - Wikipedia https://en.wikipedia.org/wiki/Valgrind
- How do I profile C++ code running on Linux? - Stack Overflow https://stackoverflow.com/questions/375913/how-do-i-profile-c-code-running-on-linux
- TCB Scans https://tcbscans.com/
- Why Rust is the most admired language among developers - The GitHub Blog https://github.blog/2023-08-30-why-rust-is-the-most-admired-language-among-developers/
- distcc with CMake/QMake to speed-up your builds – Embedded bits and pixels https://ortogonal.github.io/cpp/distcc-cmake-qmake/
- MSzturc/obsidian-advanced-slides: Create markdown-based reveal.js presentations in Obsidian https://github.com/MSzturc/obsidian-advanced-slides
- Learn more - Advanced Slides Documentation https://mszturc.github.io/obsidian-advanced-slides/getting-start/learnmore/
- epwalsh/obsidian.nvim: Obsidian 🤝 Neovim https://github.com/epwalsh/obsidian.nvim
- Download - Obsidian https://obsidian.md/download
- Introduction to the Basics · Modern CMake https://cliutils.gitlab.io/modern-cmake/chapters/basics.html
- Effective Neovim: Instant IDE - YouTube https://www.youtube.com/watch?v=stqUbv-5u2s
- 0 to LSP : Neovim RC From Scratch - YouTube https://www.youtube.com/watch?v=w7i4amO_zaE
- Deploying distcc in a kubernetes cluster to transform it into a big, fast build-machine. https://lastviking.eu/distcc_with_k8.html
- GDC Vault - AI Navigation: It's Not a Solved Problem - Yet https://www.gdcvault.com/play/1014514/AI-Navigation-It-s-Not
- 0 to LSP : Neovim RC From Scratch - YouTube https://www.youtube.com/watch?v=w7i4amO_zaE&t=832s
- tldr pages https://tldr.sh/
- How Does a Database Work?
| Let’s Build a Simple Database https://cstack.github.io/db_tutorial/
- 01 - Course Introduction & Relational Model (CMU Databases Systems / Fall 2019) - YouTube https://www.youtube.com/watch?v=oeYBdghaIjc&list=PLSE8ODhjZXjbohkNBWQs_otTrBTrjyohi&index=2
- 01 - History of Databases (CMU Databases / Spring 2020) - YouTube https://www.youtube.com/watch?v=SdW5RKUboKc&list=PLSE8ODhjZXjasmrEd2_Yi1deeE360zv5O&index=2
- Neovim With AstroNvim | Your New Advanced Development Editor - YouTube https://www.youtube.com/watch?v=GEHPiZ10gOk
- Using buffers, windows, and tabs efficiently in Vim - DEV Community https://dev.to/iggredible/using-buffers-windows-and-tabs-efficiently-in-vim-56jc
- how it works https://bigprimes.org/how-it-works
- foo.bar https://foobar.withgoogle.com/
- Codeforces https://codeforces.com/
- AtCoder https://atcoder.jp/
- CPH.pdf https://usaco.guide/CPH.pdf
- CLIST https://clist.by/
- Code Jam - Google’s Coding Competitions https://codingcompetitions.withgoogle.com/codejam
- Kick Start - Google’s Coding Competitions https://codingcompetitions.withgoogle.com/kickstart
- About - Project Euler https://projecteuler.net/
- Contests - Codeforces https://codeforces.com/group/yg7WhsFsAp/contests
- Blog entries - Codeforces https://codeforces.com/blog/Errichto
- OverTheWire: Wargames https://overthewire.org/wargames/
- CTFtime.org / All about CTF (Capture The Flag) https://ctftime.org/
- Google’s Coding Competitions - Code Jam, Hash Code and Kick Start https://codingcompetitions.withgoogle.com/
- Introduction · CTF Field Guide https://trailofbits.github.io/ctf/
- One Piece Chapter 1064 Page 19 https://cdn.onepiecechapters.com/file/CDN-M-A-N/op_1064_egg_017.png
- A Tree-Walk Interpreter · Crafting Interpreters https://craftinginterpreters.com/a-tree-walk-interpreter.html
- Curry–Howard correspondence - Wikipedia https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence
- Introduction · Crafting Interpreters
https://craftinginterpreters.com/introduction.html
- Table of Contents · Crafting Interpreters https://craftinginterpreters.com/contents.html
- CS 6120: A Unified Theory of Garbage Collection https://www.cs.cornell.edu/courses/cs6120/2019fa/blog/unified-theory-gc/
- The next 700 programming languages - landin-next-700.pdf https://homepages.inf.ed.ac.uk/wadler/papers/papers-we-love/landin-next-700.pdf
- Waterbed Theory http://wiki.c2.com/?WaterbedTheory
- Extended Backus–Naur form - Wikipedia https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form
- Why was the C syntax for arrays, pointers, and functions designed this way? - Software Engineering Stack Exchange https://softwareengineering.stackexchange.com/questions/117024/why-was-the-c-syntax-for-arrays-pointers-and-functions-designed-this-way
- Official page for Language Server Protocol https://microsoft.github.io/language-server-protocol/
- cryptopals-cpp/wecrypt at master · davidscholberg/cryptopals-cpp https://github.com/davidscholberg/cryptopals-cpp/tree/master/wecrypt
- Get started with CMake Tools on Linux https://code.visualstudio.com/docs/cpp/cmake-linux
- Step 2: Adding a Library - CMake 3.23.1 Documentation https://cmake.org/cmake/help/latest/guide/tutorial/Adding%20a%20Library.html
- What modern C++ libraries should be in my toolbox? - Stack Overflow https://stackoverflow.com/questions/777764/what-modern-c-libraries-should-be-in-my-toolbox
- Server Not Found http://wiki.libsdl.org/Installation
- Lazy Foo' Productions - Hello SDL: Your First Graphics Window https://lazyfoo.net/tutorials/SDL/01_hello_SDL/index2.php
- {sdltutorials.com} http://www.sdltutorials.com/
- SDL2 Game Tutorials - parallelrealities.co.uk https://www.parallelrealities.co.uk/tutorials/
- Installing and using packages - vcpkg https://vcpkg.readthedocs.io/en/latest/examples/installing-and-using-packages/
- gdb Tutorial https://www.cs.cmu.edu/~gilpin/tutorial/
- c++ - How to use Libraries - Stack Overflow https://stackoverflow.com/questions/10358745/how-to-use-libraries
- cxx-pflR1: The Pitchfork Layout (PFL) https://api.csswg.org/bikeshed/?force=1&url=https://raw.githubusercontent.com/vector-of-bool/pitchfork/develop/data/spec.bs
- vscode-cmake-tools/docs at main · microsoft/vscode-cmake-tools https://github.com/microsoft/vscode-cmake-tools/tree/main/docs#cmake-tools-for-visual-studio-code-documentation
- linux c++ makefile - Google Search https://www.google.com/search?client=firefox-b-1-d&q=linux+c%2B%2B+makefile
- What's a good directory structure for larger C++ projects using Makefile? - Stack Overflow https://stackoverflow.com/questions/2360734/whats-a-good-directory-structure-for-larger-c-projects-using-makefile
- c++ debugger guide - Google Search https://www.google.com/search?client=firefox-b-1-d&q=c%2B%2B+debugger+guide
- Tutorial: Debug C++ code - Visual Studio (Windows) | Microsoft Docs https://docs.microsoft.com/en-us/visualstudio/debugger/getting-started-with-the-debugger-cpp?view=vs-2022
- Translation units and linkage (C++) | Microsoft Docs https://docs.microsoft.com/en-us/cpp/cpp/program-and-linkage-cpp?view=msvc-170
- What is a general C++ project structure like?
| LinkedIn https://www.linkedin.com/pulse/what-general-c-project-structure-like-herbert-elwood-gilliland-iii/
- jupyter notebook widgets - Google Search https://www.google.com/search?client=firefox-b-1-d&q=jupyter+notebook+widgets
- Numerical Recipes http://numerical.recipes/
- C++ Language Reference | Microsoft Docs https://docs.microsoft.com/en-us/cpp/cpp/cpp-language-reference?view=msvc-170
- 1. Extending Python with C or C++ - Python 3.10.4 documentation https://docs.python.org/3/extending/extending.html
- onivim/libvim: libvim: The core Vim editing engine as a minimal C library https://github.com/onivim/libvim
- C/C++ Open Source Package Manager https://conan.io/
- C++ Code Smells - Jason Turner - CppCon 2019 - YouTube https://www.youtube.com/watch?v=f_tLQl0wLUM
- cppbestpractices/02-Use_the_Tools_Available.md at master · cpp-best-practices/cppbestpractices https://github.com/cpp-best-practices/cppbestpractices/blob/master/02-Use_the_Tools_Available.md
- Writing database storage engine from scratch. Part 1. | by Valerii Maslenikov | Medium https://medium.com/@valerii.maslenikov/writing-database-storage-engine-from-scratch-part-1-5303c549c26
- F2023 #03 - Database Storage Part 1 (CMU Intro to Database Systems) - YouTube https://www.youtube.com/watch?v=DJ5u5HrbcMk

# Done

## Books

- #book Playing to Win
- #book Game Programming Patterns
- #book Fat Chance
- #book Justice
- #book Clean Code
- #book The Clean Coder
- #book The Pragmatic Programmer
- #book Design Patterns
- #book Atomic Habits

## Articles

- reading random parts of a big file from disk into memory https://stackoverflow.com/questions/6651503/random-access-of-a-large-binary-file