This is an old revision of the document!

AI-900 Azure AI Fundamentals – Study Notes

For Study

Microsoft Learn – Azure AI Fundamentals (AI-900)
- Take the practice assessment exams until you regularly score 90%+.
AI-900 full course training videos on Microsoft Learn and YouTube.
What is Azure AI Foundry?
Azure Machine Learning Studio – AI Show episodes
General overview of how ChatGPT, DALL·E, and OpenAI work is helpful.

Regression algorithms are used to predict numeric values.
Classification algorithms are used to predict categories (which class an input belongs to).
Clustering algorithms group data points that have similar characteristics.
Supervised learning uses labeled training data (features + labels).
Unsupervised learning uses unlabeled data and includes clustering, not regression or classification.
K-Means clustering is an unsupervised algorithm used for training clustering models.

Features = input variables used by the model.
Labels = target values the model predicts.
Training dataset – features and known label values (used to train the model).
Validation dataset – features and known label values (used to tune and evaluate the model).

Supervised learning (training data is labeled):
- Regression – label is numeric.
- Classification – label is a category or class.
  - Binary classification – two classes (True/False, Yes/No).
  - Multiclass classification – more than two classes.
Unsupervised learning (training data is unlabeled):
- Clustering – grouping similar items together.

Computer Vision is used to extract information from images, but it is not a search and indexing solution by itself.
Semantic segmentation is used to classify individual pixels in an image.
OCR (Optical Character Recognition) and Spatial Analysis are part of the Azure AI Vision service.
Object detection provides the ability to generate bounding coordinates (bounding boxes) as part of its output.

Stemming or lemmatization normalizes words for counting and analysis.
Frequency analysis counts how often a word appears in a text.
N-grams extend frequency analysis to multi-term phrases.
Vectorization represents words/documents as vectors in N-dimensional space to capture relationships.
Extracting key phrases from text helps identify the main terms in NLP.
Data mining workloads focus on searching and indexing large amounts of data.
Knowledge mining is an AI workload that makes large amounts of data searchable.
Conversational AI is part of NLP and facilitates the creation of chatbots.
Language models predict the next word in a sequence of words based on context.

DALL·E generates images from natural language prompts.
GPT-3 / GPT-3.5 can understand natural language and code, but do not generate images.
Embeddings convert text into numeric vector representations, used to:
- Classify text
- Search text
- Compare similarity between texts
Whisper can transcribe and translate speech.
GPT models are strong at understanding and creating natural language.
System messages (in chat-style interactions) define constraints, style, and behavior for Gen AI responses.

Speech recognition converts spoken language into text.
Speech recognition can use audio data to identify distinct user voices.
Speech synthesis (Text-to-Speech, TTS) converts written text into spoken language.
Conversational AI–enabled devices can:
- Engage in natural language conversations with users.
- Understand user queries and provide relevant responses.
- Make interactions more human-like and intuitive.

Accountability – systems are designed to meet ethical and legal standards.
Privacy and Security – protect any personal and/or sensitive data.
Inclusiveness – empower people in a positive and engaging way.
Fairness – ensure that all users of the system are treated fairly.

Create a pipeline before using Machine Learning Designer to train a model.
Classification, regression, and time-series forecasting are all supervised machine learning models.

Computer Vision:
- Used for image understanding, not for full search/indexing solutions on its own.
Azure AI Vision:
- Includes OCR and Spatial Analysis capabilities.
Knowledge Mining & Data Mining:
- Focus on searching, indexing, and making data searchable across large content stores.