Slide Detection Fails for Earlier Slides After Jump Ahead in Video
Issue Description
The current slide detection logic in detect_slide_transitions() uses SlideTracker.current_slide_index to ensure that only forward slide transitions are recorded. If a later slide appears early in the video (e.g. because the lecturer skips ahead), every not-yet-detected slide with a lower index is permanently ignored, even when it is shown later in the video.
Steps to Reproduce
- Use the lecture video dbwt1_01.mp4 and its corresponding slide deck.
- Slide 65 is presented early in the lecture, directly after slide 11, and the lecture then continues at slide 12. Slides 12 through 64 are shown later in the video but are not detected.
Expected Behavior
- All slides shown in the video, regardless of their order in the PDF, should be detected if they appear in the video.
Actual Behavior
- Once a high-index slide is detected, all lower-index slides are skipped due to this check:
    if most_similar_slide_index <= slide_tracker.current_slide_index:
        return False
- This blocks any legitimate detection of slides that appear out of order.
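The effect of this check can be illustrated with a simplified model. This is a sketch, not the project's actual code: it assumes slides 1 through 11 appear in order before the jump and that slide indices match the slide numbers used in this report.

```python
# Simplified model of the forward-only check (not the project's actual code).
current_slide_index = -1
detected = []

# Assumed order of first appearance in the video: 1..11, then 65, then 12..64.
appearance_order = list(range(1, 12)) + [65] + list(range(12, 65))

for most_similar_slide_index in appearance_order:
    if most_similar_slide_index <= current_slide_index:
        continue  # mirrors the `return False` branch of the check
    current_slide_index = most_similar_slide_index
    detected.append(most_similar_slide_index)

# Slides that were shown in the video but never recorded.
missed = sorted(set(appearance_order) - set(detected))
print(len(detected), "detected;", len(missed), "missed")
```

Under these assumptions, only slides 1 through 11 and slide 65 are recorded; slides 12 through 64 are all rejected because their index is below 65.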
Root Cause
The assumption that slides always progress in strictly ascending PDF order does not hold for all lectures. Professors may jump ahead or backtrack during the presentation.
Potential Solutions
1. Track Seen Slides Instead of Last Index
- Replace SlideTracker.current_slide_index with a Set[int] seen_slide_indices.
- Allow detection of any slide not yet seen (even if its index is lower than a previously detected one).
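A minimal sketch of this approach follows. The SlideTracker shown here is a simplified stand-in (the real class presumably holds more state), and the method name should_record is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class SlideTracker:
    """Simplified stand-in for the project's SlideTracker (assumed shape)."""

    # Indices of slides that have already been recorded.
    seen_slide_indices: Set[int] = field(default_factory=set)

    def should_record(self, most_similar_slide_index: int) -> bool:
        """Return True the first time an index is seen, regardless of order."""
        if most_similar_slide_index in self.seen_slide_indices:
            return False
        self.seen_slide_indices.add(most_similar_slide_index)
        return True


# Order from the reproduction: slide 65 appears right after slide 11.
tracker = SlideTracker()
detected = [i for i in [11, 65, 12, 13, 65, 64] if tracker.should_record(i)]
print(detected)
```

With a set, the second appearance of slide 65 is still ignored, while slides 12, 13, and 64 are recorded even though they follow a higher index.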
2. Add Configurable Mode
- Add a flag (e.g. strict_order=True) to optionally enforce forward-only detection.
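One way such a flag could look is sketched below. The function filter_transitions is a hypothetical helper that operates on a plain list of matched slide indices rather than on video frames; defaulting strict_order to True is an assumption made here to preserve the existing behavior.

```python
from typing import Iterable, List, Set


def filter_transitions(matches: Iterable[int], strict_order: bool = True) -> List[int]:
    """Hypothetical helper: keep the slide indices that count as new transitions."""
    recorded: List[int] = []
    seen: Set[int] = set()
    current_slide_index = -1
    for index in matches:
        if strict_order:
            # Existing behavior: only accept indices above the last recorded one.
            if index <= current_slide_index:
                continue
            current_slide_index = index
        else:
            # Relaxed behavior: accept any index that has not been recorded yet.
            if index in seen:
                continue
            seen.add(index)
        recorded.append(index)
    return recorded


matches = [11, 65, 12, 64]
strict = filter_transitions(matches, strict_order=True)
relaxed = filter_transitions(matches, strict_order=False)
print(strict, relaxed)
```

Keeping both modes behind one flag lets lectures that do follow PDF order retain the stricter filter, which may help suppress false matches against earlier slides.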