Maemo and Computational Photography
Future maemo devices are in a very unique position. Currently the N900 has a high resolution camera, a fast processor, plenty of storage, and a very open environment. With all of these characteristics, this is a perfect platform on which to experiment with Computational Photography.
Some of the simpler effects may come in the form of Photoshop-like filters. These can be simple filters like watermarks, motion blur, edge detection, hue/contrast, etc. However, this is just the tip of the iceberg... Imagine face detection, augmented reality using fiducial markers, object isolation and recognition, in-photo optical character recognition, automatic HDRI splicing, 2D to 3D imaging, automatic panorama composites, image fingerprinting, etc. And then consider using these in combination with one another. Not only can they improve the quality of images taken, but they can provide shots that would be impossible with a traditional digi-cam alone. The best part? It's all software. It would be nice to have a framework that could accept processing plug-ins, be built upon, and be used in apps as a dynamic library or from the command line. This implementation would make it easy to use from an application developer's perspective, and very easy to contribute to (plugin submission rather than patch). If you have examples of CP in action, post them below! It's a growing field and I'd be interested to hear your input! PS. Here's something that could use such a framework, though in this case, not for photography: http://www.youtube.com/watch?v=zyWVH6jkDHg }:^)~ |
Re: Maemo and Computational Photography
Here's another neat idea: guessing device orientation based on outdoor shadow direction, the accelerometer, and time of day. It could be a very useful function in the absence of a compass. Perhaps not real-time, but still very useful in conjunction with GPS!
Additionally, a library like this may be useful for robots that use the N900 at its core, to process visual information. Of course, the plugins would have to be fast enough to be used in real-time! }:^)~ |
Re: Maemo and Computational Photography
Ok, last one for now!
Here's a VERY useful use: photo-copy. Imagine taking a snapshot of a page, and having the library calculate the correct orientation, compensate for page colour and lighting across the page, and then possibly run it through OCR. Despite the page being bent (think thick book), or the lighting being different across the page (also thick book), or a slightly skewed angle (quick snapshot), the page's true orientation and colour could be found, as though it was a single, flat, colour-photocopied page. This would be tremendously useful for snapping notes, business cards, reference pages, magazine pages, handouts, brochures, etc. Combine them with a tool like xournal and an upload engine, and you have TRUE magic. Of course, something like this could be contained in an app, but it would be better to contain the functionality in a library, and then build the app using the library, so that many other developers can use the CP framework in weird, wild ways. }:^)~ |
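To make the "compensating for lighting across the page" step concrete, here is a minimal sketch of one common approach (flat-field correction). It assumes a grayscale image stored as a list of rows, and it assumes the brightest pixel in a small neighbourhood is blank paper. The function names and the `radius` parameter are made up for illustration; de-skewing and page unbending are not covered.

```python
def estimate_background(gray, radius):
    """Estimate paper brightness per pixel as the local maximum in a square window."""
    h, w = len(gray), len(gray[0])
    background = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            background[y][x] = max(
                gray[yy][xx] for yy in range(y0, y1) for xx in range(x0, x1)
            )
    return background


def flatten_lighting(gray, radius=2, white=255):
    """Divide each pixel by the estimated paper brightness so blank paper maps to `white`."""
    background = estimate_background(gray, radius)
    result = []
    for row, bg_row in zip(gray, background):
        result.append(
            [min(white, round(white * p / b)) if b else 0 for p, b in zip(row, bg_row)]
        )
    return result
```

With this, the same line of text shot under bright and dim light comes out the same: `flatten_lighting([[200, 200, 40, 200, 200]])` and `flatten_lighting([[100, 100, 20, 100, 100]])` both map paper to 255 and ink to 51. The brute-force window scan is slow, so a real plugin would want something smarter.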
Re: Maemo and Computational Photography
As I said in the other thread, at the summit we heard that it is planned to open the camera app. We should certainly, either then or in the meantime, try to sort out a plug-in architecture so that photos can be post-processed in various ways.
In fact the camera app is probably easy enough to replace, other than the jpeg encoding on the DSP, which might take some work, so we could always get cracking with it now in anticipation. |
Re: Maemo and Computational Photography
This is something we are working on. Full GStreamer support in N900 allows you to have your own processing elements in Camera pipelines (there are three of them, viewfinder, photography and video recording). Post-processing with different effects for Image Viewer application is planned for Maemo 6, see Maemo Image Editor project on maemo.gitorious.org for our current code under development.
I have also been discussing the very same topic with people from Stanford University (Frankencamera) and others in the community (Elphel, for example), and internally in Nokia. We are pretty much open to contributions on the topic. One thing you would really need to understand is how all additional live processing relates to the latencies and memory bandwidth that it would introduce. I'm not even touching the CPU load; it is a secondary issue. Any additional frame copying while in the viewfinder or video recording pipelines cuts down the effective frames-per-second rate and contributes negatively to the smoothness of the UX. Any additions need to be verified closely for such side effects. |
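A back-of-the-envelope way to see why extra frame copies matter, as a rough sketch only. Every number you feed in (per-frame time, cost per copy, frame size) is an assumption you would have to measure on the device; nothing here is a measured N900 figure, and the function names are invented for illustration.

```python
def effective_fps(sensor_fps, base_ms_per_frame, extra_copies, ms_per_copy):
    """Frame rate once each frame also pays for `extra_copies` buffer copies."""
    frame_ms = base_ms_per_frame + extra_copies * ms_per_copy
    pipeline_fps = 1000.0 / frame_ms
    # The pipeline can never deliver more frames than the sensor produces.
    return min(sensor_fps, pipeline_fps)


def copy_bandwidth_mb_s(width, height, bytes_per_pixel, fps, copies):
    """Memory traffic (MB/s, counting each copied byte once) caused by the copies alone."""
    return width * height * bytes_per_pixel * fps * copies / 1e6
```

For example, with a hypothetical 30 ms per frame and 5 ms per copy, `effective_fps(30, 30, 2, 5)` drops from 30 to 25 frames per second, and one extra copy of an 800x480 frame at 2 bytes per pixel and 30 fps is about 23 MB/s of additional memory traffic.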
Re: Maemo and Computational Photography
I would like to be able to photograph a vinyl record (remember them?), and have the song come out of the N900's speakers.
This is very complex but achievable, as the Digital Needle project showed. As you can hear from their sound files, it was a proof of concept only. Regards, Roger |
Re: Maemo and Computational Photography
To throw some oil on the fire: it will probably be possible to use photoshop.com on the N900 when the accelerated Flash 10.1 comes around :) It's more like Lightroom, but nevertheless...
|
Re: Maemo and Computational Photography
Yesterday I tried a little piece of software (demo) which would be awesome on an N900:
http://people.cs.ubc.ca/~mbrown/auto...utostitch.html I can already see myself creating 4-desktop wallpapers directly on my N900... :p Thanks for stirring up interest in this subject, Capt :) |
Re: Maemo and Computational Photography
Maybe this thread needs to be merged with the HDR discussion thread?
Mike C |
Re: Maemo and Computational Photography
Great! I'm glad to see enthusiasm for this project. I believe not only can it be a big plus for Maemo and the N900, but a useful project for open source in general. I'm sure if the framework is robust enough, we can solicit the contributions of academia, or tinkerers world-wide.
Ok, enough of that. Any ideas? This is all new territory for me, so forgive me if my suggestions sound silly.
Input plugins - For the N900, images can be taken from the camera, but having a generic input plugin makes this useful for other devices, and/or for getting images from different sources (fs, internet, etc). Consider that the appropriate input plugin could handle vector imagery as well!
Output plugins - This is pretty straightforward. With an extensible output system, the output could be XML commands for an autonomous bot. The output should not be limited to a graphic format. Output to the internet also becomes possible.
Temporary filesystem storage - For processing large amounts of data, or storing data for later use. It would be nice if this was an automatic extension of the memory allocation and access system. The benefit of not doing this in swap is a) a much larger scratch pad, b) the potential to optimize reads and writes based upon operations, c) one less thing to think of as the developer (and a pretty big one!).
Node-based architecture - This is a BIG feature, and IMO a must-have. We all know that pipes are useful on the command line. It would be nice to provide an in-code mechanism to easily chain plugins together. PLEASE check out www.filterforge.com to get a good idea of the possibilities!
Resolution independence - This seems a bit odd at first, but being able to automatically interpolate lower-res images to make them size-compatible with higher-res images would be useful for comparisons.
Language independence - This is a bit of an odd one, but it would be VERY nice to be able to program these plugins in any language. It would be doubly nice to have the library itself easily portable to any number of languages as well. I'm not sure how easy/possible this would be, so feel free to add or strike me down as a raving heretic.
CP (computational photography) functions - This is the bread and butter of the application. These CP functions take 0-n resolution-independent images, 0-n data inputs, and compute 1-n outputs.
XML rule sheet - This file stores the node structure, which the library can use to load plugins, and determines how to pass information around. This is useful for using the library in a non-interactive, or code-driven way. This would also be necessary for the command-line app.
Plugin controller - It would be nice to be able to break the execution of a plugin after a completed stage, so that the app can prevent runaway situations for complex computations that would take hours on an N900, but would be well suited for a desktop. Perhaps the plugin architecture can allow for this?
Some extra thoughts:
- I think there should be a designation for real-time vs slow plugins. Perhaps a different interface for each.
- I think that this should be standards-based and avoid re-inventing the wheel where possible.
- I think that it should be an incredibly modular system to allow for disjoint development and easy integration into projects.
- We need a common interface for each category of plugin.
Anybody care to add anything? Have any ideas for a name? Any and all input is much appreciated! }:^)~ |
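To give the node-based idea something to poke at, here is a minimal sketch of chaining plugins in code. None of this is an existing library: `Node`, `Graph`, `invert` and `mean_brightness` are hypothetical names, images are plain lists of rows, and there is no cycle detection, no XML rule sheet and no real-time/slow distinction yet.

```python
class Node:
    """One processing step: a plugin function plus the names of the nodes feeding it."""

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = tuple(inputs)


class Graph:
    """A set of named nodes; running a node first runs everything it depends on."""

    def __init__(self):
        self.nodes = {}

    def add(self, name, func, inputs=()):
        self.nodes[name] = Node(name, func, inputs)

    def run(self, target, cache=None):
        # Depth-first evaluation; each node is computed once per run and cached.
        cache = {} if cache is None else cache
        if target not in cache:
            node = self.nodes[target]
            args = [self.run(dep, cache) for dep in node.inputs]
            cache[target] = node.func(*args)
        return cache[target]


def invert(image):
    """Example CP function: one image in, one image out."""
    return [[255 - p for p in row] for row in image]


def mean_brightness(image):
    """Example output plugin: one image in, one number (not a picture) out."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


graph = Graph()
graph.add("camera", lambda: [[0, 128], [255, 64]])  # stand-in input plugin
graph.add("invert", invert, ["camera"])
graph.add("brightness", mean_brightness, ["invert"])
result = graph.run("brightness")
```

Here an input plugin takes no arguments, a CP function takes the outputs of its input nodes, and an output plugin is just a node whose result is data rather than an image. An XML rule sheet would then only need to list node names, plugin names and input names.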
Re: Maemo and Computational Photography
Cap, many of the things you list are exactly what GStreamer does.
PS. My personal yet-to-be custom cam project is cam2book, where you could take quick snaps of a publication and end up with a cbr or pdf file which would then be easy to view <plug>for example with pyqtoreader</plug>, and also easy to share since it is a single file. |
Re: Maemo and Computational Photography
Quote:
I like the sound of your project! It would be so useful to be able to use the phone as document capturing device! I can't wait to see how it turns out! }:^)~ |
Re: Maemo and Computational Photography
Quote:
My current pet project is to transfer (backup) camera raws to a phone via Eye-Fi / Joikuspot to allow camera storage increase for photo shoots during holidays. Basic photo storage, develop and editing via ufraw, with direct upload to Flickr/Picasa/photoshop.com and direct print seems to me like a killer portable computer application. |
Re: Maemo and Computational Photography
eiffel, I could imagine taking a photo of the sleeve of a CD/Album and using the 'tineye.com' API(?) to match the photo with the album.
|
Re: Maemo and Computational Photography
I'm currently working on a project where I use an embedded Linux PC to control a remote digital camera. This runs gphoto2, ntpd and an FTP server.
It would be easy to do this on the N900, except it doesn't have a tripod mount! |
Re: Maemo and Computational Photography
Does anyone know if it is possible to capture a "raw" image, or perhaps a .tif? Currently I believe the output is only .jpg, and they are processed. I could see times where it would be nice to capture all the details the imager would allow and post process somewhere else, if desired.
|
Re: Maemo and Computational Photography
Hey thorbo,
This is potentially one use of the framework that's being discussed in this very thread! Of course we're in the brainstorming phase, so it's not available just yet. ;) }:^)~ |
Re: Maemo and Computational Photography
How about if you take a video and turn all the way around, keeping the horizon "in the frame". Then the computational photography application turns this into a 3200x480 wrap-around wallpaper for the device.
|
Re: Maemo and Computational Photography
Indeed these are some great ideas, and this is all possible! Even more so if there's a framework that individuals could easily build applications against. It's better than re-inventing the wheel specifically for the car, so to speak.
The node-based (or graph-based) implementation ensures easy 'stitching' of plugins for combination functionality. For example, taking a series of images and compiling them into a panorama, then spitting it out as a cut-up series of 4 800x480 pictures, and writing them to a specific maemo directory. With the right plugins, a talented developer could build such an app in less than a week: the majority of the work would be handled via the framework! Keep the ideas coming! Incidentally, if you have any ideas for data-structures, algorithms, implementation ideas, etc, now is the time to contribute your thoughts! I'd like to start compiling a list of ideas and algorithms and eventually create a wiki. I see this potentially being a HUGE component for maemo and future maemo devices. We are in a position to set the trend for the industry. Young and old people alike would enjoy using their cameras in more creative ways! }:^)~ |
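The "cut-up series of 4 800x480 pictures" step from the panorama example can be sketched like this. It assumes the panorama is already stitched and is at least one tile in each dimension; `tile_boxes` and `crop` are illustrative names, and boxes use the (left, top, right, bottom) convention.

```python
def tile_boxes(pano_width, pano_height, tile_width=800, tile_height=480):
    """Return crop boxes (left, top, right, bottom) for side-by-side desktop tiles."""
    if pano_width < tile_width or pano_height < tile_height:
        raise ValueError("panorama is smaller than one tile")
    count = pano_width // tile_width
    # Centre the strip vertically if the panorama is taller than one screen.
    top = (pano_height - tile_height) // 2
    return [
        (i * tile_width, top, (i + 1) * tile_width, top + tile_height)
        for i in range(count)
    ]


def crop(image, box):
    """Cut a box out of an image stored as a list of rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

A 3200x480 panorama gives exactly four boxes, one per desktop. Writing each crop out as a file would be the job of an output plugin.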
Re: Maemo and Computational Photography
I'm not sure it qualifies as computational photography but I still vote for a "translate street signs" type of deal. Firstly because I like the idea, secondly because AR is hot right now and thirdly because I *think* it is at least semi-doable...
Maybe the "computational photography" part would be text recognition. Ideally from a live video feed, but statically from a photo would be a fine start in my book. What I mean by text recognition is software which takes an image containing text (for instance a street sign), identifies the text as such and processes it to a point where it could be fed to an OCR program such as ocrfeeder. That would let you capture text from a photo (or ideally a live camera feed) to do with what you like. My personal favourite is to send it to Google Translate and overlay the translated text on the photo (or, to be able to call it AR, on that live video feed). I posted the suggestion on Joaquim Rocha's blog post regarding the ocrfeeder port and he basically said he was already thinking along those lines, so he might be interested in having more people working on it? |
Re: Maemo and Computational Photography
Quote:
Or photograph a face and push to the contacts database, again simple crop and resize. How about photograph a business card, character recognition and create contact Mike C |
Re: Maemo and Computational Photography
Quote:
http://www.joaquimrocha.com/2009/08/...-in-fremantle/ (scroll down to the comments) Like the other two ideas though. |
Re: Maemo and Computational Photography
Games and sports lend themselves to computational photography.
- Photograph a scrabble board with your pieces in front of it, and the N900 works out the best move
- Photograph a chess board after each move, and the N900 produces a standard-format text file documenting the game
- Photograph a bridge hand and the N900 displays the heuristics that you normally work out in your head (high card points, number of losers, etc)
- Sports training: video yourself running and the N900 overlays a golden line showing your gait movement (by tracing out the movement of the knees, feet, hips, and arms)
- Party game: photograph each participant, then the N900 blends pairs of faces. The participant who can guess which two people made up each photo, wins!
- Movement counter, to count any kind of repetitive movement (e.g. number of gym workout moves). Just start the video camera, do one sample move, and let the N900 keep track of how many more you do.
- Accuracy detector. Touch the target area on the N900 screen, then shoot your arrows (or throw your darts, or whatever) and the N900 accumulates stats on how close you got.
And some non-game ideas:
- Weather statistics: photograph the sky, and get a figure for the percentage cloud cover. (Can you photograph the sun without burning out the camera sensor?)
- Height measurer: photograph someone in front of any object of known height, and the N900 tells you how tall that person is. Actually this can be a general "measuring" application.
- When you've forgotten the name of someone who you've met before, just surreptitiously photograph them. The N900 matches them against your previous photos, and displays their name.
- For journalists: photograph a scene, and the N900 tells you how many people are in the scene. I was always amused at the London anti-war rallies, that the ratio between the crowd size as estimated by the press and the police was at least 10:1, and often much more.
- Photograph a windsock to get a readout of the wind speed; photograph a weathervane to get a readout of the wind direction
- Car parking assistant. Your passenger gets out and points the video camera at the parking space. A voice synthesizer on the N900 says "back a bit, straighten out, forward, slightly right, stop".
- Product locator. Don't you just hate it when your supermarket rearranges the shelves and you can't find the olives anymore? Just photograph your old jar of olives at home, then walk up and down the aisles at the supermarket until the camera beeps to say that it has found the olives.
Regards, Roger |
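The height measurer in the list above boils down to a proportion, sketched below. It assumes the person and the reference object stand at the same distance from the camera and that lens distortion is negligible. Finding the two pixel heights in the photo is the hard part and is not shown; `estimate_height` is an illustrative name.

```python
def estimate_height(reference_height_cm, reference_pixels, subject_pixels):
    """Scale a known real-world height by the ratio of pixel heights in the photo."""
    if reference_pixels <= 0:
        raise ValueError("reference object must be visible in the photo")
    return reference_height_cm * subject_pixels / reference_pixels
```

For example, a 200 cm door frame that spans 400 pixels next to a person spanning 350 pixels gives `estimate_height(200, 400, 350)`, i.e. 175 cm.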
Re: Maemo and Computational Photography
@eiffel
Wow... just wow! }:^o~ |
Re: Maemo and Computational Photography
A lot of the applications that you guys describe are in fact not considered computational photography. For example, recognizing street signs or the pieces on a chess board are computer vision applications, not computational photography. Computational photography refers to image acquisition methods which combine camera(s) and computation, the output is a photograph (in contrast to computer vision, where a computer tries to understand an image and for example outputs the positions of pieces on a chess board).
I have some neat ideas for computational photography on the N900 but I'm not yet sure if they are feasible (the N900 does not have a lot of computing power when talking about state of the art image processing). For a computer vision application, it would be cool to have an application that recognizes URLs in images and makes them "clickable". For example, if you have a magazine with a URL in it you can take a picture and open the URL in the browser. It would be even better to do this for video in real time, not just images. |
Re: Maemo and Computational Photography
There is an app called fotoxx that should be portable to the N900 that has a lot of nifty effects including HDR and panorama / autostitch.
|
Re: Maemo and Computational Photography
@pinish,
Excellent point. I think that all of these ideas, including the computer-vision ideas, could still fit well into an all-encompassing framework. I have begun to spec some data structures, as well as do some general research. I don't have a ton of time, however, so it'll be a slow process at best. However, hearing potential applications helps in determining whether the spec I have in mind will fit the bill. Great idea, by the by! Keep 'em coming! }:^)~ |
Re: Maemo and Computational Photography
@pinsh: I accept your point about the terminology.
An interesting computational photography application would be to capture "photo-finish" images like this one: http://www.sportingworld.co.uk/cgi-b...009photofinish These are not regular photographs, because each vertical column of pixels is taken at a different time. You can produce them like this:
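One way to get an image in which each vertical column was taken at a different time is slit-scanning: record video with the camera fixed on the finish line, then take the same pixel column out of every frame and lay those columns side by side. A minimal sketch, assuming frames are lists of rows and `column` is the x position of the finish line (both names are illustrative):

```python
def photo_finish(frames, column):
    """Build an image whose x axis is time: output column t is `column` from frame t."""
    height = len(frames[0])
    return [[frame[y][column] for frame in frames] for y in range(height)]
```

The result is as wide as the number of frames recorded, so a higher frame rate gives finer time resolution along the x axis.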
Regards, Roger |
Re: Maemo and Computational Photography
Quote:
|
Re: Maemo and Computational Photography
I'd be excited to have some computational photography and computer vision on the N900.
Here are some more ideas:
* photo-watchdog: many industrial cameras can be configured to read out a subset of their sensor with a much higher framerate. If that's possible on the N900 camera, an application could watch a selected region (e.g. finish line) for a change with very high fps and take a picture or start video recording as soon as something changes.
* Camera-shake image deblurring: might be too computationally demanding for the CPU, but the EXIF could at least contain all acceleration sensor data for later postprocessing
* more intelligent autofocus modes (like professional DSLRs) with software AF points
* online face, cow and car detection (CVs love that! ;-)
* fast object localisation: specify some target images (e.g. one sock) and get alerted when and where something matches, see e.g. http://www.kyb.mpg.de/publications/a...MI_[0].pdf
* live-preview as desktop background?
For the user experience: if some algorithm is too slow, it should subsample the video stream and indicate the fps in the preview. |
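The photo-watchdog idea can be prototyped with plain frame differencing over the watched region. This sketch assumes grayscale frames as lists of rows and a region given as (left, top, right, bottom). The threshold is something you would tune by hand, and whether the N900 sensor allows a fast partial readout is a separate question not addressed here.

```python
def region_changed(previous, current, region, threshold):
    """True if the mean absolute pixel difference inside `region` exceeds `threshold`."""
    left, top, right, bottom = region
    total = 0
    count = 0
    for y in range(top, bottom):
        for x in range(left, right):
            total += abs(current[y][x] - previous[y][x])
            count += 1
    return count > 0 and total / count > threshold


def first_trigger(frames, region, threshold):
    """Index of the first frame that differs from its predecessor inside the region."""
    for i in range(1, len(frames)):
        if region_changed(frames[i - 1], frames[i], region, threshold):
            return i
    return None
```

Only the pixels inside the region are touched, so the per-frame cost scales with the size of the watched area rather than the full frame.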
Re: Maemo and Computational Photography
Quote:
http://talk.maemo.org/showthread.php?p=357615 *** I am quite excited about doing AR and other computer vision (and image processing in general) applications with NITs, but I became a little suspicious lately that we don't have the tools to take full advantage of the processor. How do I create an application using OpenCV, OpenMAX or any other libraries, making sure I am exploiting the DSP stuff? (I'm not sure N8x0 offers much, but N900 should have that NEON thing, right?)
For example, I am implementing this algorithm called MonoSLAM. One of the tasks I need to perform is to match features in sub-regions of images captured from the camera. I'm not sure if the processor would bear it, but my first try would be to match using normalized cross correlation, and for that I need to calculate DFTs of image patches, and calculating image integrals would be nice also. Are there libraries available to do this kind of low-level image processing for me? Can I even implement them by myself somehow? Can I use GCC, or do I need some sort of proprietary compiler?
I remember some time ago people were talking about making an OGG decoding library that made use of the DSP resources. A Nokia developer, if I am not mistaken, said it probably wouldn't pay off. How come?? I want to see a "demo" of the OMAP DSP... I want to run two programs, one compiled with and the other without the "magical library" that makes signal-processing tasks faster. And if it's not much better, I want a different processor, without useless DSP things that we can't/don't need to exploit. There are few things I hate more than a processor with unused instructions.
I am very concerned about all this because only the other day I discovered, after a long time working with OpenCV just for prototypes, that you need some proprietary Intel library to have a really neat OpenCV implementation. No SIMD for the "free" crowd. I think that's kinda lame... Now I'm trying to figure out how things are happening with OMAP. Is it the same? To make full use of SIMD, do I need some proprietary compiler or library? If that is the case, I'll end up going to program for my Chinese Z80 music player. At least I get the feeling I am using all the hardware available... :(
Sorry for the long rant, I am not sure I should start another thread since so many concerned people are around here anyway. :rolleyes: I am not hijacking the thread. I am just asking: what are the tools we should be using to make the best image processing applications possible for the NITs? How should I implement my feature matching algorithm? How should I implement my own demosaicing algorithm (supposing we can get the really-raw data from the camera)? I want to make an audio synthesizer too... How do I make it as efficient as possible? |
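For reference, the two building blocks mentioned above (image integrals and normalized cross correlation) are small enough to write by hand. The sketch below is a plain, unoptimised reference version, useful mainly for checking a faster NEON or DSP implementation against. It computes NCC directly in the spatial domain rather than through DFTs, and the function names are illustrative.

```python
import math


def integral_image(img):
    """Summed-area table with an extra zero row and column at the top and left."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, left, top, right, bottom):
    """Sum of pixels with left <= x < right and top <= y < bottom, in four lookups."""
    return table[bottom][right] - table[top][right] - table[bottom][left] + table[top][left]


def ncc(patch_a, patch_b):
    """Normalized cross correlation of two equally sized patches, in [-1, 1]."""
    a = [p for row in patch_a for p in row]
    b = [p for row in patch_b for p in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [p - mean_a for p in a]
    db = [p - mean_b for p in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0:
        return 0.0  # a flat patch has no structure to correlate
    return sum(x * y for x, y in zip(da, db)) / denom
```

Because the patch means are subtracted and the result is normalised, a patch and a brighter or higher-contrast copy of it still score close to 1, which is what makes NCC attractive for matching features between frames.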
Re: Maemo and Computational Photography
I think you are mixing things together for no real benefit. NEON is part of the ARM core, and ARM is capable of running both the NEON and ARM pipelines in parallel in some cases. The DSP core is a separate part with a different assembly language, and generally speaking you can have devices where the DSP is not present but NEON and VFP are supported, like BeagleBoard.
Look at the NEON optimizations Siarhei Siamashka did for pixman: http://cgit.freedesktop.org/~siamash...rm-neon-update in his development tree. NEON code can be written by hand, but with fairly recent gcc, autovectorization at the -O3 level can produce NEON code for you on ARM. The latter is not always of good quality and generally needs more love from compiler folks to be useful. Siarhei's code makes good use of aggressive prefetching to allow both ARM and NEON to be busy. DSP programming on OMAP chipsets is a different story. You need proprietary tools for it. But once you've done your algorithms and compiled them, you can use the recently opened Maemo gst-dsp to hook up algorithms into gstreamer pipelines. It works faster and more reliably than TI's original library for DSP communication. See http://maemo.gitorious.org/maemo-multimedia/gst-dsp for more details in the code. There is also an OpenMAX bridge for gstreamer: http://maemo.gitorious.org/maemo-multimedia/gst-openmax which has been in use for some time but also needs more love before becoming really useful. The gst-dsp and NEON optimization approaches worked better. |
Re: Maemo and Computational Photography
Thanks so much! I have only recently heard about NEON, and I am still understanding where it sits. :) I'll sure take a look at that pixman code.
The DSP in OMAP is only a TI thing, right? I need a proprietary compiler to use it? Are there at least open specifications of how it works, or do we need to reverse-engineer the thing to at least play with it a bit? |
Re: Maemo and Computational Photography
Yes, it is TI's product. Read http://focus.ti.com/lit/ug/spru732h/spru732h.pdf and related documents if you want to understand how to work with it. Tools are available on TI's website, for example http://software-dl.ti.com/dsps/dsps_...index_FDS.html.
|
Re: Maemo and Computational Photography
In case someone is interested, a computational photography course using N900s has just started at Stanford
http://www-graphics.stanford.edu/courses/cs448a-10/ Hartti |
Re: Maemo and Computational Photography
Maybe you already know this, but here it is: http://qtpfsgui.sourceforge.net/
an open source program to create HDR images. Would it be possible to use it directly on the N900? (note: I didn't test it yet) |
Re: Maemo and Computational Photography
I have been using enfuse in Easy Debian to create HDR images on my device. It does a decent job, takes about 6 minutes to produce a final picture. My main problem is that the script by davost in the HDR thread to take three "bracketed" photos doesn't focus correctly.
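For anyone wondering what blending bracketed shots involves, here is a heavily simplified sketch of exposure fusion: each pixel is weighted by how close it is to mid-grey, then the exposures are averaged with those weights. This is only the basic idea and not what enfuse actually runs, since enfuse also blends across multiple resolutions and can use further quality measures. Pixel values are assumed to be floats in 0..1, the images are assumed to be already aligned, and the `sigma` default is an assumption.

```python
import math


def well_exposedness(value, sigma=0.2):
    """Weight that peaks at mid-grey (0.5) and falls off towards black and white."""
    return math.exp(-((value - 0.5) ** 2) / (2 * sigma ** 2))


def fuse(exposures):
    """Per-pixel weighted average of aligned grayscale exposures (lists of rows)."""
    height, width = len(exposures[0]), len(exposures[0][0])
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [img[y][x] for img in exposures]
            weights = [well_exposedness(v) for v in values]
            row.append(sum(v * w for v, w in zip(values, weights)) / sum(weights))
        fused.append(row)
    return fused
```

A pixel that is blown out in one shot and well exposed in another ends up dominated by the well-exposed value, which is why misfocused or misaligned brackets show up so clearly in the result.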
|
Re: Maemo and Computational Photography
This was always an interesting thread, so I figured I might inform those tracking it that F-Camera was released for the N900, though it currently has some install issues. It is the result of the Frankencamera group at Nokia Research and Stanford.
|
Re: Maemo and Computational Photography
Take close-up photos of a record and have it find the grooves, reconstruct the wave form and play the music. (a pocket record player!)
|