Item Information

Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.creator | Siddharth, Narayanaswamy | - |
| dc.creator | Barbu, Andrei | - |
| dc.creator | Siskind, Jeffrey Mark | - |
| dc.date | 2015-12-10T17:56:01Z | - |
| dc.date | 2015-12-10T17:56:01Z | - |
| dc.date | 2014-05-29 | - |
| dc.date.accessioned | 2023-04-13T10:00:02Z | - |
| dc.date.available | 2023-04-13T10:00:02Z | - |
| dc.identifier | http://hdl.handle.net/1721.1/100169 | - |
| dc.identifier | arXiv:1308.4189v2 | - |
| dc.identifier.uri | http://lib.yhn.edu.vn/handle/YHN/716 | - |
| dc.description | We present a system that demonstrates how the compositional structure of events, in concert with the compositional structure of language, can interplay with the underlying focusing mechanisms in video action recognition, thereby providing a medium not only for top-down and bottom-up integration, but also for multi-modal integration between vision and language. We show how the roles played by participants (nouns), their characteristics (adjectives), the actions performed (verbs), the manner of such actions (adverbs), and changing spatial relations between participants (prepositions), in the form of whole sentential descriptions mediated by a grammar, guide the activity-recognition process. Further, the utility and expressiveness of our framework are demonstrated by performing three separate tasks in the domain of multi-activity videos: sentence-guided focus of attention, generation of sentential descriptions of video, and query-based video search, simply by leveraging the framework in different manners. | - |
| dc.description | This research was supported, in part, by ARL, under Cooperative Agreement Number W911NF-10-2-0060, and the Center for Brains, Minds and Machines, funded by NSF STC award CCF-1231216. | - |
| dc.format | application/pdf | - |
| dc.language | en_US | - |
| dc.publisher | Center for Brains, Minds and Machines (CBMM), arXiv | - |
| dc.relation | CBMM Memo Series;006 | - |
| dc.rights | Attribution-NonCommercial 3.0 United States | - |
| dc.rights | http://creativecommons.org/licenses/by-nc/3.0/us/ | - |
| dc.subject | Computer vision | - |
| dc.subject | Machine Learning | - |
| dc.subject | Computer Language | - |
| dc.title | Seeing What You’re Told: Sentence-Guided Activity Recognition In Video | - |
| dc.type | Technical Report | - |
| dc.type | Working Paper | - |
| dc.type | Other | - |
Appears in Collections: Tài liệu ngoại văn (Foreign-Language Documents)

Files in This Item:
  • CBMM-Memo-006.pdf (Restricted Access)
    • Size: 1.23 MB
    • Format: Adobe PDF