feature(notizen): add notes from l2 morning

2026-04-30 10:26:07 +02:00
parent 1c8c4e5142
commit 5a7f4dfe38
2 changed files with 139 additions and 1 deletions
@@ -1,6 +1,6 @@
 # Notes Lesson 1

->Topic: Introduction to Practical Machine Learning
+>Topic: Introduction to Practical Machine Learning 1
 >Date: 22.04.2026
 >Lecturer: Jürgen Vogel

@@ -0,0 +1,138 @@
+# Notes Lesson 2
+
+>Topic: Introduction to Practical Machine Learning 2
+>Date: 22.04.2026
+>Lecturer: Jürgen Vogel
+
+## Recap
+
+> [!NOTE] Definition
+algorithm that learns from experience E to solve some tasks T with performance P and P improves with E
+
+- Model
+    - represents the solution to the tasks T
+    - is learnt and adapted based on E
+    - can be evaluated with respect to P
+- Features
+    - are the relevant part of the data E for creating the model
+    - may have to be designed explicitly depending on the ML algorithm
+- Categorization with respect to
+    - experience E: supervised vs. unsupervised vs. reinforcement learning
+    - tasks T: clustering vs. classification vs. regression
+    - human-readable model: white box vs. black box
+- Project
+    - agile/iterative development (CRISP-DM)
+- Key Challenges
+    - definition of T that is both solvable and generates value
+    - large amounts of high quality data E
+    - feature engineering
+    - dealing with 95% models
+
+## Evaluation
+
+### How good is the machine learning system?
+
+- returned result is good if it solves the problem at hand
+    - may be qualitative or quantitative
+    - may be subjective (user need, context, and preferences)
+    - may change over time
+    - also depends on factors such as credibility, specificity, exhaustivity, recency, clarity, interpretability... of the result
+
+- Example search engine: a set of keywords is entered into a search engine
+    - When is the search engine's answer "good"?
+        - Hard to answer, because it differs from user to user
+    - Casual user: question from a general context -> a more general answer is fine
+        - "Where is an open restaurant within walking distance?"
+            - The goal is not to find the best possible option or every restaurant
+        - A fast, good-enough result
+    - Expert user: researches very detailed information
+        - Wants to run an extensive analysis
+        - What scientific literature exists on the topic?
+        - What are the best methods?
+        - Very high information need
+
+- thus, the ML system needs to be assessed in "real-life" situations
+    - often with user involvement
+    - similar methods as with user requirements research
+        - usability tests, interviews, field studies, log analysis
+    - but this takes time and is costly
+
+### Metrics SR/ER
+
+- Important:
+    - Success Rate
+    - Error Rate
+
+- Success
+    - Result is correct -> a single sample was classified correctly
+    - success rate -> average over a larger set of samples
+        - also called accuracy
+- Error
+    - Result is incorrect -> a single sample was classified incorrectly
+    - error rate -> average over a larger set of samples
+
+- Both are 1/0 judgments -> a result is either right or wrong
+
+- Example: how many people are in the picture?
+    - Model says 3 people
+    - The picture shows 5 people
+    - How do we score this?
+        - wrong? -> 100% error
+        - partially right? 3/5 detected, 2/5 error
+
+- Generalizing the success rate gives
+    - our ML system takes some test data D as input and produces some results
+        - D -> {r'1, ... r'n}
+        - e.g. if the r'i come from a list of predefined labels, we call this classification
+    - the test data also includes the expected result "gold standard"
+        - D -> {r1, ..., rn}
+    - for the test setting, we define a comparison function
+        - c(r, r') = 1 if r = r', 0 else # comparison function
+    - then we can calculate the success rate SR as
+        - SR = (1/n)*sum(i=1, n, c(ri, r'i))
+
+### Precision and Recall for Binary Classification
+
+- Example search engine -> we want to evaluate whether the model works well
+    - A test set was compiled for a search query
+    - Manually labeled (gold standard):
+        - For every entry we know: the website is relevant or not relevant
+
+Evaluation:
+    
+|                     | positive gold       | negative gold       |
+| ------------------- | ------------------- | ------------------- |
+| positive classified | true positive (TP)  | false positive (FP) |
+| negative classified | false negative (FN) | true negative (TN)  |
+
+- True Positives: classifier says positive, gold standard says positive
+- True Negatives: classifier says negative, and that is correct
+- False Negatives: classifier says negative, but the gold standard says positive
+    - this is an error
+    - Search engine example: the search engine does not return a result although it would be relevant
+- False Positives: classifier says positive, but that is not correct
+    - this is another kind of error
+    - Search engine example: the search engine returns a non-relevant result
+
+- Metrics derived from this:
+    - **Precision**
+        - Share of TP among all samples the classifier labeled positive
+        - If the algorithm produces no false positives, precision is 100%
+        - P = TP / (TP + FP), i.e. TP over everything classified positive
+        - Example: how many of the displayed websites are actually relevant according to the gold standard?
+    - **Recall**
+        - Share of the gold-standard positives that the classifier found; every false negative lowers it
+        - R = TP / (TP + FN), i.e. TP over everything positive in the gold standard
+        - Example: which of the pages a human (gold standard) labeled relevant are actually displayed?
+            - Perfect when all relevant pages are displayed
+            - Bad when no relevant pages are found
+
+
+
+
+
+
+
+
+
+
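
The success rate, confusion-matrix counts, precision, and recall from the Lesson 2 notes can be sketched in a few lines of Python. This is only an illustration and not part of the commit: the function names and the sample data are hypothetical, and returning 0.0 for an empty denominator is an assumed convention that the notes do not specify.

```python
from typing import Dict, Sequence


def success_rate(gold: Sequence, predicted: Sequence) -> float:
    """SR = (1/n) * sum of c(r_i, r'_i), where c is 1 on an exact match, else 0."""
    if len(gold) != len(predicted) or len(gold) == 0:
        raise ValueError("gold and predicted must be non-empty and equally long")
    matches = sum(1 for r, r_pred in zip(gold, predicted) if r == r_pred)
    return matches / len(gold)


def confusion_counts(gold: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, int]:
    """Count TP, FP, FN, TN for a binary task (True = positive)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for g, p in zip(gold, predicted):
        if p and g:
            counts["TP"] += 1
        elif p and not g:
            counts["FP"] += 1
        elif not p and g:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def precision(counts: Dict[str, int]) -> float:
    """P = TP / (TP + FP); 0.0 if nothing was classified positive (assumed convention)."""
    denominator = counts["TP"] + counts["FP"]
    return counts["TP"] / denominator if denominator else 0.0


def recall(counts: Dict[str, int]) -> float:
    """R = TP / (TP + FN); 0.0 if the gold standard has no positives (assumed convention)."""
    denominator = counts["TP"] + counts["FN"]
    return counts["TP"] / denominator if denominator else 0.0


# Hypothetical search-engine test set: True = page is relevant / page is returned.
gold = [True, True, True, True, False, False, False, False]
predicted = [True, True, False, False, True, False, False, False]

counts = confusion_counts(gold, predicted)
print(counts)
print("SR:", success_rate(gold, predicted))
print("P:", precision(counts))
print("R:", recall(counts))
```

In this made-up test set the engine returns three pages, two of them relevant, and misses two relevant pages, so precision and recall differ even though both are computed from the same true-positive count.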