[AI] Classifying what's (un)interesting in a movie

We have just started gathering ideas for a Very Smart Camera (working title): a piece of software for a camera that could automatically determine whether what it has just captured is interesting for the user and should be saved to disk, or whether it should be discarded.

I’m opening this thread to brainstorm ideas on how to learn to tell interesting footage from uninteresting footage.

I believe that we’re actually dealing with two different issues here, depending on the notion of (un)interesting and what we want to do with the camera.

1. Interesting vs. uninteresting

Example use case: recording wild animals.

This case is characterized by the fact that the user will be able to watch movies and manually label interesting frames (e.g. “there is a lion”) vs. uninteresting frames (e.g. “it’s just the wind blowing in the trees”).

I suspect that we can learn to classify interesting/uninteresting with Deep Neural Networks. I also suspect that we don’t want to provide as input only the current frame, but a mega-image built from e.g. the last 5 seconds.
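To make the “mega-image” idea a bit more concrete, here is a minimal sketch in Python with NumPy. It assumes frames arrive as arrays of shape (height, width, channels) and simply concatenates the most recent ones along the channel axis. The function name `build_mega_image` and its parameters (`fps`, `seconds`, `step`) are made up for illustration; tiling the frames side by side or feeding a video-aware network would be equally valid choices.

```python
import numpy as np

def build_mega_image(frames, fps, seconds=5.0, step=1):
    """Combine the most recent `seconds` of video into a single array.

    frames:  list of arrays, each of shape (H, W, C), oldest first (assumption)
    fps:     frames per second of the capture
    step:    keep one frame out of every `step` to limit the input size
    Returns an array of shape (H, W, C * number_of_kept_frames).
    """
    if not frames:
        raise ValueError("need at least one frame")
    # Number of frames covering the requested time window.
    window = max(1, int(round(fps * seconds)))
    # Take the tail of the buffer, then subsample it.
    recent = frames[-window:][::step]
    # Stack along the channel axis so a classifier sees all frames at once.
    return np.concatenate(recent, axis=-1)
```

For example, at 2 frames per second a 5-second window of RGB frames yields an input with 30 channels. Increasing `step` trades temporal detail for a smaller input.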

Any input from someone with experience in Deep Neural Networks (or other learning techniques) would be interesting.

2. Normal vs. odd

Example use case: detecting that someone has fallen on the stairs.

This case is characterized by the fact that we typically have a corpus of “normal” frames (e.g. people walking up or down the stairs, people looking at the ceiling) but no samples of “abnormal” frames (e.g. people falling).

It is very unclear to me how we can classify/learn to discover abnormal frames.
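One possible framing, offered as a starting point for discussion rather than an answer: model only the “normal” corpus and flag anything that lies far from it. The sketch below assumes each frame (or window of frames) has already been reduced to a numeric feature vector, which is itself an open question here. The names `fit_normal_model` and `is_odd`, the per-feature z-score, and the 0.99 quantile are all illustrative assumptions.

```python
import numpy as np

def fit_normal_model(normal_features, quantile=0.99):
    """Summarize 'normal' feature vectors, shape (n_samples, n_features)."""
    x = np.asarray(normal_features, dtype=float)
    mean = x.mean(axis=0)
    std = x.std(axis=0) + 1e-8  # avoid division by zero on constant features
    # Oddness score: mean squared z-score of each normal sample.
    scores = (((x - mean) / std) ** 2).mean(axis=1)
    # Anything scoring above most normal samples will be called odd.
    threshold = np.quantile(scores, quantile)
    return mean, std, threshold

def is_odd(features, model):
    """Return True if one feature vector scores above the normal threshold."""
    mean, std, threshold = model
    z = (np.asarray(features, dtype=float) - mean) / std
    return bool((z ** 2).mean() > threshold)
```

By construction, roughly 1% of the normal samples themselves land above the threshold, so the quantile directly sets the false-alarm rate we are willing to accept.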

@yoric, it would be quite a good idea to use this in certain security cameras, which could record only when people or vehicles come into the scene, or possibly focus on an unusual event. It would save storage space.

One easy way would be to outsource the processing to any human in the frame :). From what I’ve read elsewhere, detecting emotions in human faces is now rather readily doable. So, any video in which a human face shows a strong emotion can be classified as interesting.

That could cause serious privacy issues, unfortunately, even if we take precautions to blur out faces. For instance, imagine seeing your bedroom with two obviously naked people, and you’re pretty sure that neither of them is you.

Possibly. We can keep this as a heuristic, but I’d prefer something more generic.