
Grounding Object Recognition and Scene Understanding (MIT, Fall 2011)

Computer Vision Central - Posted on December 22, 2011 at 3:29 pm.

  • Links: http://people.csail.mit.edu/torralba/courses/6.870_2011f/6.870.grounding.html
  • Details:
  • This class will cover current approaches to object recognition and scene understanding in computer vision and their relation to other disciplines. The goal is to provide an in-depth presentation of computer vision techniques for recognizing objects, scenes, materials, actions, and more, placing them in the framework of concrete tasks.

    The class is addressed to students from any discipline, not just vision, who are interested in learning about computer vision techniques that can be applied to their research. We will cover state-of-the-art object recognition and scene understanding techniques and how they relate to robotics, language, computer graphics, crowdsourcing, human-computer interaction, etc. For students in computer vision, this class offers a chance to explore new tasks and scene representations that go beyond labeling objects in images for its own sake.

    The course will cover bag-of-words models, part-based models, classifier-based models, multiclass object recognition and transfer learning, concurrent recognition and segmentation, context models for object recognition, representations for scene understanding, and large datasets for semi-supervised and unsupervised discovery of object and scene categories. We will read a mixture of computer vision papers and influential works from cognitive psychology and other disciplines.


    Schedule

    Date | Topic | Lecture | Invited speaker | Slides/videos | Links to papers/code
     Introduction    
    Sept. 7 | Class goals and a short introduction | Antonio

    Lecture1 (ppt)

    -P. Cavanagh, Vision is getting easier every day, Perception 1996
    Sept. 14 | Edges, textures, ... | Antonio | Lecture2 (ppt)




    Sept. 21 | The importance of data | Antonio

    Boris Katz

    Student: 
    Carl Vondrick

    Lecture3 (ppt)
    Boris (ppt)
    Carl (ppt)

    -LabelMe (website, paper.pdf)
    -Watson (paper.pdf)
    -START (system website, paper.pdf)
    -Video annotation 
    -Visipedia 

    Sept. 28 | Object recognition | Antonio

    Seth Teller

    Student:
    David Hayden

    Lecture4 (ppt)

    -Felzenszwalb, McAllester and Ramanan. A Discriminatively Trained, Multiscale, Deformable Part Model. CVPR 2008. (code)
    - Manipulation (paper.pdf)
    - Natural language commands (paper.pdf)


    Oct. 5 | Object recognition in context | Antonio

    Nicholas Roy

    Student:
    Ryan Schoen

    Lecture5 (ppt)

    tellex11.pdf
    hri10-tk.pdf
    icra09-tk.pdf

    Oct. 12 | Human vision | Antonio

    Aude Oliva

    Student:
    Deborah Hanus

    Lecture6 (ppt)

    - A. Oliva, A. Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. IJCV 2001. (gist code)
    - S. Lazebnik, C. Schmid, and J. Ponce. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. CVPR 2006. (code)



    Oct. 19 | Words and pictures | Antonio

    Regina Barzilay

    Student:
    Yevgeni Berzak

    Lecture7 (ppt)

    Gestural Cohesion for Discourse Segmentation
    Jacob Eisenstein, Regina Barzilay, Randall Davis
    Proceedings of ACL, 2008

    Modeling Gesture Salience as a Hidden Variable for Coreference Resolution and Keyframe Extraction
    Jacob Eisenstein, Regina Barzilay, Randall Davis
    Journal of Artificial Intelligence Research, 2008

    Turning Lectures into Comic Books with Linguistically Salient Gestures
    Jacob Eisenstein, Regina Barzilay, Randall Davis
    Proceedings of AAAI, 2007


    Oct. 26 | Multiclass models and transfer learning | Antonio

    Daniela Rus

    Student:
    Sudeep Pillai

    Lecture8 (ppt)




    Nov. 2 | No class




    Nov. 9 | No class | ICCV




    Nov. 16 | Vision and the brain | Antonio

    Jim DiCarlo

    Student:
    Ha Hong

    Lecture9 (ppt)

    Jim DiCarlo's papers


    Nov. 23 | HCI | Antonio

    Students:
    Mike Fleder
    Jeremy Scott
    Yafim Landa 
    Lecture10 (ppt)




    Nov. 30 | 3D scenes | Antonio

    Students:
    Emily Zhao
    Xiaodan Jia
    Lecture11 (ppt)




     Projects    
    Dec. 7 | Project presentations | Antonio




    Dec. 14 | Projects | Antonio | Last day of classes




    Resources

    Related courses:

    Workshops:

    Tutorials:

    Other resources:

    Datasets

    Code

    Here are links to useful code for low-level and mid-level vision tasks:

    Other useful code:

