AI and machine learning technology is advancing rapidly. There is a great deal of active research, and big tech is leading the way. Luckily, there are also plenty of resources out there for technologists to use. There are so many that we had to cherry-pick the tools and datasets that look the most legitimate and useful.
- Accord Framework
http://accord-framework.net
- Aligned Face Dataset from Pinterest (CC0)
https://www.kaggle.com/frules11/pins-face-recognition
- Amazon Reviews Dataset
https://snap.stanford.edu/data/web-Amazon.html
- Apache SystemML
https://systemml.apache.org
- AWS Open Data
https://registry.opendata.aws
- Baidu Apolloscapes
http://apolloscape.auto
- Beijing Laboratory of Intelligent Information Technology Vehicle Dataset
http://iitlab.bit.edu.cn/mcislab/vehicledb
- Berkeley Caffe
http://caffe.berkeleyvision.org
- Berkeley DeepDrive
https://bdd-data.berkeley.edu
- Caltech Dataset
http://www.vision.caltech.edu/html-files/archive.html
- Cats in Movies Dataset
https://public.opendatasoft.com/explore/dataset/cats-in-movies/information
- Chinese Character Dataset
http://www.iapr-tc11.org/mediawiki/index.php?title=Harbin_Institute_of_Technology_Opening_Recognition_Corpus_for_Chinese_Characters_(HIT-OR3C)
- Chinese Text in the Wild Dataset (CC4.0)
https://ctwdataset.github.io
- CelebA Dataset (research only)
http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
- Cityscapes Dataset
https://www.cityscapes-dataset.com
- Clash of Clans User Comments Dataset (GPL 2)
https://www.kaggle.com/moradnejad/clash-of-clans-50000-user-comments
- Core ML
https://developer.apple.com/machine-learning
- Cornell Movie Dialogs Corpus
http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html
- Deep Learning for Java
https://deeplearning4j.org
- Enron Email Dataset
https://www.cs.cmu.edu/~./enron
- Facebook AI Tools
https://ai.facebook.com/tools
- GitHub Deep Learning
https://github.com/topics/deep-learning
- GitHub Machine Learning
https://github.com/topics/machine-learning
- GitHub Natural Language Processing
https://github.com/topics/nlp
- GitHub TensorFlow
https://github.com/topics/tensorflow
- Google Dataset Search
https://toolbox.google.com/datasetsearch
- Google Facial Expression Comparison Dataset (CC0 1.0)
https://ai.google/tools/datasets/google-facial-expression
- Google Landmarks Dataset
https://www.kaggle.com/google/google-landmarks-dataset
- Google ML Kit
https://developers.google.com/ml-kit
- Google Open Images Dataset
https://ai.googleblog.com/2016/09/introducing-open-images-dataset.html
- Google Teachable Machine
https://teachablemachine.withgoogle.com
- H2O AI
https://www.h2o.ai
- IBM Watson Starter Kits
https://cloud.ibm.com/developer/watson/starter-kits
- IMDB Movie Review Dataset
http://ai.stanford.edu/~amaas/data/sentiment
- ImageNet Image Database
http://image-net.org
- JVC Video Game Reviews Dataset
https://www.kaggle.com/floval/jvc-game-reviews
- Kaggle Datasets
https://www.kaggle.com
- Labeled Faces in the Wild
http://vis-www.cs.umass.edu/lfw
- LabelMe Dataset
http://labelme.csail.mit.edu/Release3.0/browserTools/php/dataset.php
- LISA Traffic Light Dataset (CC BY-NC-SA 4.0)
https://www.kaggle.com/mbornoe/lisa-traffic-light-dataset
- Machine Learning Playground
http://ml-playground.com
- Machine Learning Showcase
https://ml-showcase.com
- Mahout
https://mahout.apache.org
- Microsoft Cognitive Toolkit
https://docs.microsoft.com/en-us/cognitive-toolkit
- Microsoft Distributed Machine Learning Toolkit
http://www.dmtk.io
- Million Song Dataset
http://millionsongdataset.com
- MLlib
https://spark.apache.org/mllib
- Movie Review Datasets
http://www.cs.cornell.edu/people/pabo/movie-review-data
- MovieLens Datasets
https://grouplens.org/datasets/movielens
- Mushroom Dataset
https://archive.ics.uci.edu/ml/datasets/mushroom
- MXNet
https://mxnet.apache.org
- Mycroft
https://mycroft.ai
- Natural Earth Data
http://www.naturalearthdata.com/downloads
- Numenta
https://numenta.com
- ONNX
https://onnx.ai
- Open ML Datasets
https://www.openml.org/search?type=data
- OpenCyc
https://www.cyc.com/opencyc
- OpenNN
http://www.opennn.net
- Oryx 2
http://oryx.io
- Oxford Robotcar Dataset (CC4.0)
https://robotcar-dataset.robots.ox.ac.uk
- PredictionIO
http://predictionio.apache.org
- Price of Weed Dataset
https://github.com/frankbi/price-of-weed
- PyTorch
https://pytorch.org
- Real & Fake Face Detection
https://www.kaggle.com/ciplab/real-and-fake-face-detection
- Scikit-learn
https://scikit-learn.org
- Shogun
https://www.shogun-toolbox.org
- Stanford Cars Dataset
http://ai.stanford.edu/~jkrause/cars/car_dataset.html
- Stanford Dogs Dataset
http://vision.stanford.edu/aditya86/ImageNetDogs
- Stanford Large Network Dataset Collection
https://snap.stanford.edu/data
- Stanford Sentiment Treebank
https://nlp.stanford.edu/sentiment/code.html
- The Blog Authorship Corpus (research only)
http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm
- The French Lexicon Project
https://sites.google.com/site/frenchlexicon/results
- Theano
http://www.deeplearning.net/software/theano
- TensorFlow
https://www.tensorflow.org
- TME Motorway Dataset (research only)
http://cmp.felk.cvut.cz/data/motorway
- Torch
http://torch.ch
- Tufts Face Database (research only)
http://tdface.ece.tufts.edu
- UCI Machine Learning Repository
http://archive.ics.uci.edu/ml/index.php
- UFO Reports Dataset
https://github.com/planetsig/ufo-reports
- Vandal Video Game Reviews Dataset
https://www.kaggle.com/floval/12-000-video-game-reviews-from-vandal
- Visual Genome
http://visualgenome.org
- Wacky Corpus (CC BY-NC-SA 4.0)
https://wacky.sslmit.unibo.it/doku.php?id=corpora
- Wine Quality Dataset
https://archive.ics.uci.edu/ml/datasets/wine+quality
- World Bank Open Data
https://data.worldbank.org
- Yale Face Database (research only)
http://cvc.cs.yale.edu/cvc/projects/yalefaces/yalefaces.html
- Yelp Open Dataset (research only)
https://www.yelp.com/dataset
- YouTube-8M Segments Dataset
https://research.google.com/youtube8m
Big Tech R&D
- AI2
https://allenai.org
- AWS Machine Learning
https://aws.amazon.com/machine-learning
- Baidu Research
http://research.baidu.com/Blog
- Berkeley Artificial Intelligence Research (BAIR)
https://bair.berkeley.edu
- DeepMind
https://deepmind.com
- Duolingo AI
https://ai.duolingo.com
- Energy.gov
https://www.energy.gov/artificial-intelligence-and-machine-learning
- Facebook AI
https://ai.facebook.com
- Facebook AI Research
https://research.fb.com/category/facebook-ai-research
- GE Artificial Intelligence
https://www.ge.com/research/technology-domains/artificial-intelligence
- Google AI
https://ai.google
- Google AI & Machine Learning Products
https://cloud.google.com/products/ai
- IBM Research AI
https://www.research.ibm.com/artificial-intelligence
- Intel AI
https://software.intel.com/en-us/ai
- Journal of Artificial Intelligence Research (JAIR)
https://www.jair.org
- Microsoft Artificial Intelligence
https://www.microsoft.com/en-us/research/research-area/artificial-intelligence
- OpenAI
https://openai.com
- Partnership on AI
https://www.partnershiponai.org
- TayTweets
https://twitter.com/tayandyou
This data is from Vuild’s list of AI/machine learning tools & datasets. Please visit vuild.com for more.