AI-900 Azure AI Fundamentals – Study Notes
For Study
- Take the practice assessment exams until you regularly score 90%+.
- AI-900 full course training videos on Microsoft Learn and YouTube.
- A general overview of how ChatGPT, DALL·E, and OpenAI work is helpful.
Udemy Resources – Practice Exams
Core Concepts & Notes
Machine Learning Basics
- Regression algorithms are used to predict numeric values.
- Classification algorithms are used to predict categories (which class an input belongs to).
- Clustering algorithms group data points that have similar characteristics.
- Supervised learning uses labeled training data (features + labels).
- Unsupervised learning uses unlabeled data. It includes clustering, but not regression or classification (both of which are supervised).
- K-Means clustering is an unsupervised algorithm used for training clustering models.
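To make the clustering idea concrete, here is a minimal pure-Python sketch of the K-Means loop on one-dimensional data. The function name `k_means_1d`, the sample points, and the starting centroids are made up for illustration. Azure Machine Learning uses its own implementation, so treat this only as a picture of the assign-then-update cycle.

```python
def k_means_1d(points, centroids, iterations=10):
    """Toy K-Means on 1-D data: no labels are used, only distances."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical sample data with two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)  # final cluster centers
print(clusters)   # points grouped by similarity
```

Note that the data carries no labels at any point; the groups emerge only from how close the points are to each other, which is what makes this unsupervised.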
Datasets, Features & Labels
- Features = input variables used by the model.
- Labels = target values the model predicts.
- Training dataset – features and known label values (used to train the model).
- Validation dataset – features and known label values (used to tune and evaluate the model).
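As a rough illustration of the terms above, the sketch below holds out part of a small labeled dataset for validation. The rows (temperature and rain as features, ice cream sales as the label), the helper name `split_dataset`, and the 25% validation fraction are all assumptions chosen for the example.

```python
import random


def split_dataset(rows, validation_fraction=0.25, seed=0):
    """Shuffle labeled rows and hold out a fraction for validation."""
    shuffled = rows[:]                      # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed for repeatability
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]  # (training, validation)


# Hypothetical data: features are the inputs, label is the value to predict.
rows = [
    {"features": {"temperature": 18, "rain": 1}, "label": 40},
    {"features": {"temperature": 21, "rain": 0}, "label": 95},
    {"features": {"temperature": 24, "rain": 0}, "label": 130},
    {"features": {"temperature": 27, "rain": 0}, "label": 170},
    {"features": {"temperature": 19, "rain": 1}, "label": 55},
    {"features": {"temperature": 30, "rain": 0}, "label": 210},
    {"features": {"temperature": 22, "rain": 1}, "label": 70},
    {"features": {"temperature": 26, "rain": 0}, "label": 160},
]

training, validation = split_dataset(rows)
print(len(training), "training rows;", len(validation), "validation rows")
```

Both subsets keep their known labels: the training rows are used to fit the model, and the validation rows are used to check its predictions against values it did not see during training.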
Machine Learning Types (Summary)
- Supervised learning (training data is labeled):
- Regression – label is numeric.
- Classification – label is a category or class.
- Binary classification – two classes (True/False, Yes/No).
- Multiclass classification – more than two classes.
- Unsupervised learning (training data is unlabeled):
- Clustering – grouping similar items together.
Computer Vision
- Computer Vision is used to extract information from images, but it is not a search and indexing solution by itself.
- Semantic segmentation is used to classify individual pixels in an image.
- OCR (Optical Character Recognition) and Spatial Analysis are part of the Azure AI Vision service.
- Object detection returns the bounding-box coordinates of each object it identifies in an image as part of its output.
Natural Language Processing (NLP)
- Stemming and lemmatization normalize words (for example, treating "run" and "running" as the same term) before counting and analysis.
- Frequency analysis counts how often a word appears in a text.
- N-grams extend frequency analysis to multi-term phrases.
- Vectorization represents words/documents as vectors in N-dimensional space to capture relationships.
- Extracting key phrases from text helps identify the main terms in NLP.
- Data mining workloads focus on searching and indexing large amounts of data.
- Knowledge mining is an AI workload that makes large amounts of data searchable.
- Conversational AI is part of NLP and facilitates the creation of chatbots.
- Language models predict the next word in a sequence of words based on context.
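Frequency analysis, N-grams, and next-word prediction from the list above can be sketched with the standard library alone. The sample sentence and the helper `predict_next` are invented for illustration. Real language models are far more sophisticated than this bigram count, so read it as a toy that shows the idea only.

```python
from collections import Counter

# Hypothetical sample text, tokenized by whitespace.
text = "the cat sat on the mat and the cat slept"
tokens = text.lower().split()

# Frequency analysis: how often each word appears.
word_counts = Counter(tokens)

# N-grams with N = 2 (bigrams): counts of adjacent word pairs.
bigram_counts = Counter(zip(tokens, tokens[1:]))


def predict_next(word):
    """Return the word that most often follows `word`, or None if unseen."""
    followers = Counter(
        nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == word
    )
    return followers.most_common(1)[0][0] if followers else None


print(word_counts.most_common(2))
print(bigram_counts[("the", "cat")])
print(predict_next("the"))
```

Here "the" is followed by "cat" twice and "mat" once, so the toy predictor picks "cat", which is the same context-based guess a language model makes at a much larger scale.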
Azure OpenAI & Generative AI Services
- DALL·E generates images from natural language prompts.
- GPT-3 / GPT-3.5 can understand natural language and code, but do not generate images.
- Embeddings convert text into numeric vector representations, used to:
- Classify text
- Search text
- Compare similarity between texts
- Whisper can transcribe and translate speech.
- GPT models are strong at understanding and creating natural language.
- System messages (in chat-style interactions) define constraints, style, and behavior for Gen AI responses.
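The way embeddings support similarity comparison can be illustrated with cosine similarity between vectors. The three-dimensional vectors below are hand-written stand-ins, not output from any Azure OpenAI embedding model (real embeddings have hundreds or thousands of dimensions), and the function name is just a label for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: made-up numbers chosen so that related
# words point in similar directions.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

related = cosine_similarity(embeddings["cat"], embeddings["kitten"])
unrelated = cosine_similarity(embeddings["cat"], embeddings["invoice"])
print(f"cat vs kitten:  {related:.3f}")
print(f"cat vs invoice: {unrelated:.3f}")
```

Search and classification build on the same comparison: a query or document is embedded, then matched to whichever stored vectors score highest.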
Speech Services
- Speech recognition converts spoken language into text.
- Speech recognition can use audio data to identify distinct user voices.
- Speech synthesis (Text-to-Speech, TTS) converts written text into spoken language.
- Conversational AI–enabled devices can:
- Engage in natural language conversations with users.
- Understand user queries and provide relevant responses.
- Make interactions more human-like and intuitive.
Principles of Responsible AI
- Accountability – systems are designed to meet ethical and legal standards.
- Privacy and Security – protect any personal and/or sensitive data.
- Inclusiveness – AI systems should empower everyone and engage people in a positive way.
- Fairness – ensure that all users of the system are treated fairly.
Azure Machine Learning Designer
- In Azure Machine Learning designer, you must create a pipeline before you can train a model.
- Classification, regression, and time-series forecasting are all supervised machine learning models.
Additional Key Points
- Computer Vision:
- Used for image understanding, not for full search/indexing solutions on its own.
- Azure AI Vision:
- Includes OCR and Spatial Analysis capabilities.
- Knowledge Mining & Data Mining:
- Focus on searching, indexing, and making data searchable across large content stores.