Video Matting Benchmark

What’s new

21.03.20	We published evaluation results for FBA Matting [22].
30.01.20	We published evaluation results for Matting with Background Estimation [21].
08.12.17	We published evaluation results for Information-Flow Matting [20].
07.04.17	Subjective study results are now available.
26.12.16	We published evaluation results for Self-Adaptive Matting [17].
16.11.16	We published evaluation results for Deep Matting [16].
6.04.16	We published evaluation results for Sparse Sampling Matting [14].
15.12.15	1) New sequences with natural hairs including 3 public sequences. 2) New metrics of temporal coherency chosen by careful analysis (see [3]). 3) New trimap generation method (more natural-looking and accurate). 4) Better ground-truth quality owing to correction of lightning changes during capturing. 5) Improvement in website loading speed and interface.
7.09.15	We published the paper with our benchmark description [3].
30.12.14	We published results for multiple levels of trimaps; use drop-down menu at the top left corner to switch levels.
29.12.14	We added general ranking to the rating table.
10.11.14	“Sparse codes as Alpha Matte” was added.
26.09.14	Source sequences are available for online view now. Full screen mode was added.
30.08.14	Composite sequences are available now.
27.08.14	“Refine edge tool in Adobe After Effects” was added.
25.08.14	The official opening.

Overview

Introduction

The VideoMatting project is the first public objective benchmark for video-matting methods. It contains scatter plots and rating tables for different quality metrics. In addition, results for participating methods are available for viewing on a player equipped with a movable zoom region. We believe our work will help rank existing methods and aid developers of new methods in improving their results.

Datasets

The data set consists of five moving objects captured in front of a green plate and seven captured using the stop-motion procedure described below. We composed the objects over a set of background videos with various levels of 3D camera motion, color balance, and noise. We published ground-truth data for two stop-motion sequences and hid the rest to ensure fairness of the comparison.

Using thresholding and morphological operations on ground-truth alpha mattes, we generated narrow trimaps. Then, we dilated the results using graphcut-based energy minimization which provides us with more handmade-looking trimaps than common morphological dilation.

Chroma Keying

green screen

stop motion

Alpha mattes from chroma keying and stop-motion capture for the same image region. The stop-motion result is significantly better at preserving details.

Chroma keying is a common practice of the cinema industry: the cinematographer captures an actor in front of a green or blue screen, then the VFX expert replaces the background using special software. Our evaluation uses five green-screen video sequences with a significant amount of semitransparency (e.g., hair or motion blur), provided to us by Hollywood camera work. We extract alpha mattes and corresponding foregrounds using The Foundry Keylight. Chroma keying enables us to get alpha mattes of natural-looking objects with arbitrary motion. Nevertheless, this technique can’t guarantee that the alpha maps are natural, because it assumes the screen color is absent from the foreground object. To get alpha maps that have a very natural appearance, we use the stop-motion method.

Stop Motion

One-step capture over different backgrounds. We use checkerboard backgrounds instead of solid ones to eliminate screen reflection.

We designed the following procedure to perform stop-motion capture: A fuzzy toy is placed on the platform in front of an LCD monitor. The toy rotates in small, discrete steps along a predefined 3D trajectory, controlled by two servos connected to a computer. After each step the digital camera in front of the setup captures the motionless toy against a set of background images. At the end of this process, the toy is removed and the camera again captures all of the background images.

We paid special attention to avoiding reflections of the background screen in the foreground object. These reflections can lead to false transparency that is especially noticeable in nontransparent regions. To reduce the amount of reflection we used checkerboard background images instead of solid colors, thereby adjusting the mean color of the screen to be the same for each background.

At the end we corrected global lighting changes caused by light bulb flickering. Thus finally we obtain alpha mattes with less than 1% of noise level. The detailed description of ground-truth extraction methods is given in [3].

Evaluation Methodology

Our comparison includes both image- and video-matting methods. We apply each matting method to the videos in our data set, and then compare the results using the following metrics of per-pixel accuracy and temporal coherency (look into our paper [3] for comparison of different metrics):

Here denotes total number of pixels, and denote transparency values of video matting under consideration and ground truth correspondingly at pixel of frame , and denotes motion vector at pixel . We use optical-flow algorithm [11] computed for ground-truth sequences. It is worth noting that motion-aware metrics will not give unfair advantage to matting methods based on the similar motion estimation method since they do not have ground truth sequence. The detailed description of used quality metrics is given in [3].

Public Sequences

For the training purposes we publish here three test sequences with their ground-truth transparency maps. Developers and researchers are welcome to use these sequences, but we ask to cite us [3]

Castle

Alex

Dmitriy

Participate

We invite developers of video-matting methods to use our benchmark. We will evaluate the submitted data and report scores to the developer. In cases where the developer specifically grants permission, we will publish the results on our site. We can also publish anonymous scores for blind-reviewed papers. To participate, simply follow these steps:

Download the data set containing our sequences: City, Flowers, Concert, Rain, Snow, Vitaliy, Artem, Slava, Juneau, Woods,
Apply your method to each of our test cases
Upload the alpha and foreground sequences to any file-sharing service. We kindly ask you to maintain these naming and directory-structure conventions. If your method doesn't explicitly produce the foreground images you can skip uploading them; in this case, we will generate them using method proposed in [7].
Fill in this form to provide information about your method

Cite Us

To refer to our evaluation or test sequences in your work cite our paper [3].

@inproceedings{Erofeev2015,
	title={Perceptually Motivated Benchmark for Video Matting},
	author={Mikhail Erofeev and Yury Gitman and Dmitriy Vatolin and Alexey Fedorov and Jue Wang},
	year={2015},
	month={September},
	pages={99.1-99.12},
	articleno={99},
	numpages={12},
	booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
	publisher={BMVA Press},
	doi={10.5244/C.29.99},
	isbn={1-901725-53-7},
	url={https://dx.doi.org/10.5244/C.29.99}
}

Evaluation

Rating

Trimap size:

Trimap available for each frame
	year	rank	city	rain	concert	flowers	snow	Slava	Vitaliy	Artem	juneau	woods
Bayesian Matting [2]	2001	12.6	58.76¹⁴	51.66¹²	10.79⁸	144.53¹³	30.36¹³	53.28¹³	43.70¹³	70.50¹⁴	96.69¹²	140.12¹⁴
Robust Matting [12]	2007	8.2	37.54¹⁰	26.15⁴	9.61⁴	88.15⁷	18.52⁸	33.31¹⁰	34.15⁷	46.91¹¹	78.56¹¹	80.41¹⁰
Refine Edge [15]	2013	9.9	37.78¹¹	38.48¹⁰	8.64²	129.56¹¹	27.47¹²	31.36⁹	36.65¹¹	38.75⁸	102.02¹³	94.50¹²
Closed Form [7]	2008	7.9	31.34⁶	31.80⁷	10.96¹⁰	149.48¹⁴	15.86⁵	23.67⁷	30.15⁵	30.60⁵	69.94⁹	82.13¹¹
Learning Based [13]	2009	5.9	29.98⁴	33.08⁸	10.84⁹	134.82¹²	15.23⁴	23.35⁵	31.46⁶	30.20³	58.38⁴	56.60⁴
Nonlocal matting^s [5]	2011	14.7	54.21¹³	92.37¹⁵	33.86¹⁵	243.47¹⁶	45.13¹⁵	64.05¹⁵	61.75¹⁵	67.30¹³	135.08¹⁵	171.79¹⁵
Shared Matting [4]	2010	8.9	36.79⁹	46.72¹¹	10.30⁷	87.64⁶	24.41¹⁰	35.18¹¹	35.26⁹	45.23¹⁰	68.52⁸	67.17⁸
Comprehensive Sampling^s [10]	2008	7.9	34.48⁸	30.18⁶	12.10¹¹	79.60⁴	16.17⁶	25.02⁸	35.25⁸	41.05⁹	77.72¹⁰	70.18⁹
KNN Matting [1]	2012	13.9	67.83¹⁵	92.22¹⁴	64.73¹⁶	112.46¹⁰	32.96¹⁴	61.50¹⁴	50.87¹⁴	79.20¹⁵	113.41¹⁴	123.63¹³
Spectral Matting^s [6]	2012	15.5	86.32¹⁶	105.24¹⁶	13.15¹²	210.04¹⁵	48.76¹⁶	65.72¹⁶	68.83¹⁶	109.44¹⁶	157.18¹⁶	182.82¹⁶
Sparse Sampling^s [14]	2016	6.8	33.42⁷	33.50⁹	9.94⁵	81.06⁵	18.42⁷	23.64⁶	35.83¹⁰	38.00⁷	65.17⁶	64.12⁶
Deep Matting [16]	2016	4.6	27.33³	21.40³	20.33¹³	44.93²	19.23⁹	19.99³	30.00⁴	30.55⁴	46.18³	46.39²
Self-Adaptive [17]	2016	4.8	30.05⁵	29.30⁵	8.02¹	96.44⁹	14.32³	21.96⁴	29.92³	35.97⁶	66.05⁷	61.49⁵
Information-Flow [20]	2017	10.6	50.81¹²	59.62¹³	24.80¹⁴	94.81⁸	25.93¹¹	48.82¹²	42.14¹²	55.32¹²	64.02⁵	66.56⁷
Background Matting [21]	2020	2.5	22.77²	17.53²	10.06⁶	63.61³	10.80²	17.15²	25.19¹	22.27²	45.44²	46.85³
FBA Matting [22]	2020	1.3	17.25¹	16.69¹	8.76³	40.94¹	9.92¹	15.30¹	27.24²	19.10¹	28.67¹	30.45¹
Bayesian Matting [2]	2001	13.6	73.18¹⁴	67.92¹³	16.06¹¹	157.28¹⁴	39.59¹⁴	60.37¹³	56.05¹⁴	85.53¹⁵	115.88¹⁴	169.33¹⁴
Robust Matting [12]	2007	9.5	48.19¹¹	32.89⁷	14.72⁹	99.81⁷	25.96⁹	42.51¹⁰	38.88⁹	59.08¹¹	93.79¹¹	96.13¹¹
Refine Edge [15]	2013	9.6	42.23¹⁰	41.74¹⁰	10.45²	127.02¹¹	31.40¹²	35.08⁹	39.46¹⁰	42.91⁸	108.71¹²	97.76¹²
Closed Form [7]	2008	7.6	34.53⁶	35.86⁸	13.35⁶	154.12¹³	21.64⁸	27.19⁷	33.01⁵	35.45⁵	81.48⁸	92.75¹⁰
Learning Based [13]	2009	6.2	33.44⁵	37.63⁹	13.23⁵	138.31¹²	19.95⁷	26.51⁵	34.51⁶	34.70⁴	68.41⁵	63.53⁴
Nonlocal matting^s [5]	2011	14.5	55.04¹³	97.85¹⁴	31.81¹⁴	242.75¹⁶	46.94¹⁵	67.11¹⁵	65.09¹⁵	69.01¹³	144.66¹⁵	176.61¹⁵
Shared Matting [4]	2010	9.3	41.44⁹	56.75¹¹	14.14⁷	96.71⁶	27.66¹¹	42.92¹¹	40.05¹¹	58.17¹⁰	83.02⁹	78.91⁸
Comprehensive Sampling^s [10]	2008	6.9	38.43⁸	28.29⁴	14.45⁸	83.85⁴	16.84⁴	27.29⁸	37.07⁷	47.38⁹	89.96¹⁰	78.45⁷
KNN Matting [1]	2012	13.5	79.35¹⁵	98.71¹⁵	66.24¹⁶	112.81¹⁰	34.60¹³	64.17¹⁴	54.44¹²	84.38¹⁴	112.13¹³	128.86¹³
Spectral Matting^s [6]	2012	15.3	99.05¹⁶	125.29¹⁶	16.04¹⁰	214.42¹⁵	59.61¹⁶	74.55¹⁶	77.25¹⁶	122.89¹⁶	162.73¹⁶	188.16¹⁶
Sparse Sampling^s [14]	2016	5.2	33.20⁴	31.92⁵	12.34⁴	85.52⁵	17.66⁵	24.64⁴	37.78⁸	39.87⁶	71.05⁶	69.94⁵
Deep Matting [16]	2016	4.1	28.67³	21.10³	41.80¹⁵	44.81²	16.57³	21.04³	30.57³	30.94³	46.40³	50.27³
Self-Adaptive [17]	2016	6.1	34.71⁷	32.21⁶	11.02³	101.76⁹	18.61⁶	26.53⁶	31.23⁴	40.01⁷	75.29⁷	70.68⁶
Information-Flow [20]	2017	10.5	54.68¹²	67.36¹²	27.93¹³	100.44⁸	26.93¹⁰	55.24¹²	55.54¹³	60.15¹²	66.35⁴	79.33⁹
Background Matting [21]	2020	3.0	22.97²	18.15²	16.25¹²	68.75³	10.85²	17.08²	24.61¹	22.74²	43.41²	46.86²
FBA Matting [22]	2020	1.1	18.90¹	16.87¹	9.85¹	42.48¹	10.16¹	15.64¹	27.35²	19.85¹	28.80¹	30.95¹
Bayesian Matting [2]	2001	14.1	98.13¹⁵	95.95¹³	38.54¹⁴	170.76¹⁴	59.51¹⁵	80.72¹⁵	67.64¹³	122.08¹⁴	135.59¹⁴	179.25¹⁴
Robust Matting [12]	2007	10.7	67.71¹³	49.02⁹	19.87⁷	109.79⁹	44.10¹³	62.18¹¹	50.54¹⁰	81.49¹²	118.09¹²	115.96¹¹
Refine Edge [15]	2013	9.1	50.31⁷	51.81¹⁰	15.85⁴	125.25¹¹	40.73¹²	45.38⁹	46.43⁹	52.22⁷	118.88¹³	103.30⁹
Closed Form [7]	2008	7.8	45.04⁶	41.69⁶	24.19⁹	159.47¹³	30.09⁹	36.85⁸	39.20⁴	46.97⁵	104.07⁸	110.26¹⁰
Learning Based [13]	2009	6.4	44.97⁵	43.75⁷	21.90⁸	143.54¹²	26.50⁶	33.89⁶	40.44⁶	45.93⁴	90.35⁶	81.22⁴
Nonlocal matting^s [5]	2011	13.7	56.76¹⁰	113.90¹⁵	33.97¹³	241.31¹⁶	50.74¹⁴	77.14¹⁴	68.37¹⁴	76.43¹¹	159.08¹⁵	181.14¹⁵
Shared Matting [4]	2010	9.7	55.84⁹	79.09¹¹	19.78⁶	107.45⁸	32.94¹⁰	60.78¹⁰	51.47¹¹	122.33¹⁵	107.34⁹	97.22⁸
Comprehensive Sampling^s [10]	2008	6.9	56.79¹¹	37.36⁵	16.96⁵	87.64⁴	24.04⁵	34.32⁷	43.69⁷	64.63⁹	110.50¹⁰	93.35⁶
KNN Matting [1]	2012	12.7	95.89¹⁴	112.70¹⁴	66.12¹⁶	113.10¹⁰	37.18¹¹	71.26¹³	60.03¹²	98.42¹³	110.87¹¹	138.66¹³
Spectral Matting^s [6]	2012	15.3	116.87¹⁶	157.37¹⁶	26.12¹⁰	218.94¹⁵	76.22¹⁶	88.74¹⁶	91.69¹⁶	148.70¹⁶	169.33¹⁶	193.46¹⁶
Sparse Sampling^s [14]	2016	4.8	43.32⁴	34.66⁴	14.78³	89.79⁵	19.23⁴	29.92⁴	44.59⁸	51.94⁶	85.44⁵	81.77⁵
Deep Matting [16]	2016	4.0	31.29³	23.16²	53.02¹⁵	44.93²	18.20³	24.36³	32.04³	37.60³	50.48³	54.76³
Self-Adaptive [17]	2016	6.4	51.29⁸	46.56⁸	14.35²	106.71⁷	27.55⁷	31.98⁵	39.35⁵	54.80⁸	95.90⁷	95.73⁷
Information-Flow [20]	2017	10.3	66.86¹²	86.01¹²	30.99¹²	104.14⁶	29.25⁸	64.32¹²	82.92¹⁵	69.33¹⁰	69.71⁴	117.38¹²
Background Matting [21]	2020	3.0	25.77²	23.62³	29.33¹¹	75.90³	11.74²	18.31²	24.83¹	24.97²	43.92²	50.51²
FBA Matting [22]	2020	1.1	22.02¹	19.82¹	11.94¹	44.62¹	11.21¹	17.05¹	28.11²	22.14¹	29.79¹	33.06¹
Bayesian Matting [2]	2001	13.8	35.93¹⁴	65.24¹²	9.28¹⁰	104.68¹⁵	27.61¹³	35.04¹⁶	46.41¹³	60.34¹⁵	88.46¹⁵	121.91¹⁵
Robust Matting [12]	2007	8.2	22.32⁹	32.73⁴	7.26⁴	47.27¹³	16.04⁸	18.72¹⁰	34.47⁷	24.96⁸	53.42¹¹	42.40⁸
Refine Edge [15]	2013	7.2	14.41²	44.08⁹	6.30³	40.69⁸	18.36¹⁰	16.53⁶	34.93⁸	17.83⁴	53.43¹²	43.60¹⁰
Closed Form [7]	2008	6.2	16.37⁴	41.29⁷	7.67⁶	35.14⁷	14.59⁵	16.40⁵	32.43⁵	17.37³	51.40⁸	47.71¹²
Learning Based [13]	2009	5.8	17.29⁶	42.99⁸	7.88⁸	34.03⁵	14.44⁴	17.07⁸	34.37⁶	18.37⁵	46.02⁴	39.46⁴
Nonlocal matting^s [5]	2011	14.6	29.07¹³	110.76¹⁴	14.10¹⁵	120.01¹⁶	29.82¹⁵	32.69¹⁴	59.91¹⁵	36.22¹⁴	108.43¹⁶	96.82¹⁴
Shared Matting [4]	2010	9.2	22.72¹⁰	59.59¹¹	7.72⁷	45.92¹²	20.34¹¹	22.35¹¹	38.02¹¹	29.56¹¹	47.45⁵	38.19³
Comprehensive Sampling^s [10]	2008	8.9	22.78¹¹	39.97⁶	13.83¹⁴	33.11⁴	15.61⁶	17.51⁹	37.09⁹	25.72⁹	53.33¹⁰	45.04¹¹
KNN Matting [1]	2012	13.7	36.16¹⁵	115.08¹⁵	25.58¹⁶	44.09¹⁰	27.80¹⁴	31.61¹³	56.08¹⁴	34.80¹³	87.53¹⁴	75.36¹³
Spectral Matting^s [6]	2012	13.9	65.50¹⁶	131.74¹⁶	4.78¹	103.61¹⁴	39.03¹⁶	34.39¹⁵	63.21¹⁶	64.40¹⁶	87.44¹³	122.53¹⁶
Sparse Sampling^s [14]	2016	8.9	21.78⁸	44.85¹⁰	9.86¹¹	34.07⁶	16.94⁹	16.55⁷	37.40¹⁰	25.84¹⁰	51.77⁹	42.98⁹
Deep Matting [16]	2016	4.7	16.79⁵	25.93³	13.12¹³	26.74²	15.64⁷	14.58³	31.85³	19.25⁶	45.03³	37.27²
Self-Adaptive [17]	2016	5.8	18.76⁷	38.89⁵	7.37⁵	45.75¹¹	13.36³	15.44⁴	32.26⁴	20.12⁷	49.15⁶	40.48⁶
Information-Flow [20]	2017	10.8	28.00¹²	78.55¹³	10.64¹²	41.13⁹	22.56¹²	28.71¹²	42.89¹²	29.75¹²	49.89⁷	41.44⁷
Background Matting [21]	2020	2.4	15.00³	21.44²	6.25²	32.74³	10.53²	13.40²	27.79¹	16.29²	42.39²	40.10⁵
FBA Matting [22]	2020	1.9	12.65¹	19.55¹	8.79⁹	23.72¹	10.30¹	12.32¹	28.52²	14.82¹	29.27¹	28.06¹
Bayesian Matting [2]	2001	14.2	46.96¹⁵	77.27¹²	11.32¹⁰	111.60¹⁵	33.85¹⁵	39.54¹⁶	56.86¹²	75.50¹⁶	102.70¹⁵	151.60¹⁶
Robust Matting [12]	2007	9.3	26.36¹⁰	37.73⁵	8.35⁶	52.07¹³	19.18⁹	21.45¹⁰	37.08⁸	28.55¹⁰	57.78¹²	47.95¹⁰
Refine Edge [15]	2013	6.4	15.24³	47.35¹⁰	6.58²	40.54⁸	19.40¹⁰	16.77⁵	35.74⁶	18.24⁴	54.11¹⁰	43.43⁶
Closed Form [7]	2008	6.8	17.25⁵	42.80⁸	7.11⁴	35.79⁷	16.70⁸	17.10⁷	33.94⁵	18.08³	52.82⁹	48.94¹²
Learning Based [13]	2009	5.9	18.97⁶	45.14⁹	7.45⁵	34.62³	16.32⁷	18.09⁸	36.77⁷	19.50⁶	48.00⁴	39.16⁴
Nonlocal matting^s [5]	2011	14.1	29.67¹²	114.94¹⁴	14.84¹³	121.99¹⁶	31.03¹⁴	33.66¹⁴	61.34¹⁵	37.04¹³	117.02¹⁶	93.18¹⁴
Shared Matting [4]	2010	9.8	24.21⁹	68.25¹¹	9.00⁸	49.95¹²	21.88¹¹	24.25¹¹	42.74¹¹	37.14¹⁴	51.50⁶	39.51⁵
Comprehensive Sampling^s [10]	2008	8.8	26.80¹¹	35.36⁴	15.59¹⁴	34.80⁵	15.73⁴	18.29⁹	39.17¹⁰	27.81⁹	55.93¹¹	48.73¹¹
KNN Matting [1]	2012	13.2	38.85¹⁴	120.08¹⁵	26.54¹⁶	44.01⁹	28.25¹³	31.83¹³	57.58¹³	35.78¹²	85.37¹⁴	75.34¹³
Spectral Matting^s [6]	2012	13.7	72.45¹⁶	155.21¹⁶	5.03¹	104.95¹⁴	44.50¹⁶	35.84¹⁵	64.96¹⁶	66.66¹⁵	85.23¹³	130.32¹⁵
Sparse Sampling^s [14]	2016	7.2	23.66⁸	39.99⁶	11.54¹¹	35.17⁶	16.17⁶	16.48⁴	38.93⁹	27.46⁸	51.93⁷	44.55⁷
Deep Matting [16]	2016	4.3	17.21⁴	24.74³	21.03¹⁵	27.68²	14.33³	14.98³	31.41³	19.05⁵	45.10³	36.96²
Self-Adaptive [17]	2016	7.0	20.61⁷	40.20⁷	8.93⁷	46.73¹¹	15.80⁵	17.03⁶	32.95⁴	21.69⁷	52.15⁸	44.90⁸
Information-Flow [20]	2017	11.1	34.10¹³	86.71¹³	13.95¹²	45.99¹⁰	23.26¹²	31.00¹²	61.24¹⁴	30.42¹¹	49.38⁵	44.99⁹
Background Matting [21]	2020	2.3	14.92²	22.50²	6.91³	34.67⁴	10.69²	13.21²	27.25¹	16.16²	40.04²	38.86³
FBA Matting [22]	2020	1.9	13.67¹	19.98¹	9.62⁹	24.32¹	10.47¹	12.41¹	28.71²	15.11¹	29.15¹	28.29¹
Bayesian Matting [2]	2001	15.2	68.32¹⁵	110.78¹³	29.70¹⁶	119.54¹⁵	47.66¹⁵	52.24¹⁶	67.60¹⁵	118.01¹⁶	98.34¹⁵	148.17¹⁶
Robust Matting [12]	2007	9.8	32.72¹¹	53.12⁸	9.85⁵	56.83¹³	25.40¹²	27.41¹⁰	42.73⁸	36.49¹¹	62.55¹²	52.27⁸
Refine Edge [15]	2013	5.8	15.82³	57.57¹⁰	6.58²	40.25⁸	20.67⁹	18.04⁴	37.30⁶	19.11³	55.24⁷	44.18⁶
Closed Form [7]	2008	5.9	18.75⁵	48.30⁶	7.76³	36.78⁶	19.57⁷	18.21⁶	35.50⁴	19.15⁴	57.13⁹	53.95⁹
Learning Based [13]	2009	5.6	21.35⁶	50.91⁷	7.84⁴	35.59³	18.92⁵	19.37⁸	39.75⁷	21.31⁶	51.66⁵	41.99⁵
Nonlocal matting^s [5]	2011	13.7	31.25¹⁰	129.13¹⁴	17.96¹²	123.49¹⁶	31.74¹⁴	36.13¹⁵	62.32¹³	38.82¹³	127.06¹⁶	91.37¹⁴
Shared Matting [4]	2010	10.2	27.95⁹	93.58¹¹	10.60⁹	55.32¹²	25.04¹¹	27.72¹¹	48.28¹¹	65.06¹⁴	58.63¹⁰	41.34⁴
Comprehensive Sampling^s [10]	2008	8.8	34.62¹²	42.66⁵	16.53¹¹	35.69⁴	19.23⁶	21.26⁹	44.33¹⁰	33.28¹⁰	60.01¹¹	55.61¹⁰
KNN Matting [1]	2012	12.8	42.71¹⁴	130.14¹⁵	26.50¹⁵	43.95⁹	29.19¹³	33.06¹²	59.22¹²	38.04¹²	82.47¹³	76.16¹³
Spectral Matting^s [6]	2012	13.5	81.14¹⁶	183.76¹⁶	5.49¹	108.67¹⁴	54.31¹⁶	35.81¹⁴	67.31¹⁴	70.45¹⁵	82.74¹⁴	128.41¹⁵
Sparse Sampling^s [14]	2016	6.8	27.39⁸	41.55⁴	12.15¹⁰	36.95⁷	17.04⁴	18.19⁵	43.94⁹	32.67⁸	54.81⁶	49.26⁷
Deep Matting [16]	2016	4.1	18.72⁴	28.24²	23.65¹⁴	28.18²	15.36³	16.66³	31.80³	20.98⁵	47.83³	36.95²
Self-Adaptive [17]	2016	7.9	25.66⁷	56.94⁹	10.23⁶	49.29¹⁰	19.97⁸	18.71⁷	37.00⁵	25.31⁷	55.89⁸	57.11¹²
Information-Flow [20]	2017	11.2	41.09¹³	106.61¹²	18.03¹³	49.30¹¹	24.59¹⁰	35.49¹³	94.92¹⁶	32.86⁹	49.22⁴	56.92¹¹
Background Matting [21]	2020	2.9	15.57²	28.72³	10.58⁸	36.50⁵	11.33¹	13.56²	27.23¹	16.71²	39.75²	38.14³
FBA Matting [22]	2020	1.8	14.43¹	23.68¹	10.55⁷	24.98¹	11.49²	12.93¹	29.29²	15.85¹	29.64¹	29.22¹
Bayesian Matting [2]	2001	13.8	1.85¹⁴	3.58¹²	0.12¹¹	13.33¹⁴	0.74¹⁴	2.31¹⁶	1.47¹³	4.51¹⁵	8.01¹⁴	16.23¹⁵
Robust Matting [12]	2007	9.5	0.89¹¹	0.94⁴	0.07⁴	4.82¹³	0.25⁹	0.88¹⁰	0.77⁸	1.58¹²	3.89¹²	3.92¹²
Refine Edge [15]	2013	8.0	0.55⁶	1.48¹⁰	0.06²	3.99¹⁰	0.34¹⁰	0.54⁹	0.70⁷	0.80⁶	3.80¹¹	3.16⁹
Closed Form [7]	2008	5.7	0.49³	1.35⁸	0.09⁸	2.88⁴	0.19⁵	0.35⁶	0.61³	0.61⁴	2.75⁶	3.38¹⁰
Learning Based [13]	2009	5.6	0.50⁵	1.42⁹	0.09⁹	2.88⁵	0.18⁴	0.34⁵	0.68⁶	0.64⁵	2.37⁴	2.43⁴
Nonlocal matting^s [5]	2011	14.9	1.56¹³	8.17¹⁵	0.70¹⁵	20.97¹⁶	1.11¹⁵	1.88¹⁵	2.51¹⁶	2.35¹⁴	11.17¹⁶	13.37¹⁴
Shared Matting [4]	2010	10.1	0.79¹⁰	2.73¹¹	0.09⁷	4.50¹²	0.35¹¹	1.00¹²	0.91¹¹	1.43¹¹	2.94⁸	2.94⁸
Comprehensive Sampling^s [10]	2008	8.8	0.74⁹	1.06⁵	0.19¹²	3.15⁷	0.22⁷	0.39⁸	0.86¹⁰	1.24⁹	3.65¹⁰	3.40¹¹
KNN Matting [1]	2012	13.2	2.28¹⁵	6.40¹⁴	1.22¹⁶	3.47⁸	0.61¹³	1.02¹³	1.73¹⁴	1.94¹³	6.42¹³	7.73¹³
Spectral Matting^s [6]	2012	14.5	7.03¹⁶	13.25¹⁶	0.08⁶	15.36¹⁵	2.09¹⁶	1.84¹⁴	2.45¹⁵	6.59¹⁶	8.71¹⁵	19.97¹⁶
Sparse Sampling^s [14]	2016	7.9	0.66⁸	1.30⁷	0.11¹⁰	3.09⁶	0.25⁸	0.38⁷	0.86⁹	1.04⁸	2.94⁹	2.76⁷
Deep Matting [16]	2016	4.6	0.50⁴	0.48³	0.32¹⁴	1.15²	0.22⁶	0.25³	0.63⁵	0.57³	1.84³	1.84³
Self-Adaptive [17]	2016	5.6	0.58⁷	1.15⁶	0.06¹	4.43¹¹	0.16³	0.33⁴	0.62⁴	0.87⁷	2.91⁷	2.66⁶
Information-Flow [20]	2017	10.2	1.10¹²	3.94¹³	0.27¹³	3.90⁹	0.41¹²	0.97¹¹	1.10¹²	1.38¹⁰	2.46⁵	2.61⁵
Background Matting [21]	2020	2.3	0.38²	0.32²	0.08⁵	2.11³	0.10²	0.20²	0.43¹	0.35²	1.62²	1.82²
FBA Matting [22]	2020	1.3	0.23¹	0.30¹	0.07³	0.92¹	0.09¹	0.15¹	0.50²	0.26¹	0.77¹	0.87¹
Bayesian Matting [2]	2001	14.6	3.12¹⁵	5.07¹³	0.22¹¹	16.65¹⁵	1.34¹⁵	3.03¹⁶	2.50¹³	11.60¹⁶	18.03¹⁶	25.63¹⁶
Robust Matting [12]	2007	10.7	1.34¹¹	1.26⁶	0.15⁷	6.19¹³	0.50¹²	1.42¹²	0.99¹⁰	2.38¹²	5.21¹²	5.44¹²
Refine Edge [15]	2013	7.0	0.66⁵	1.62¹⁰	0.08¹	3.95⁹	0.42⁹	0.63⁹	0.76⁶	0.92⁶	4.05¹⁰	3.16⁵
Closed Form [7]	2008	6.1	0.61⁴	1.46⁸	0.11⁴	3.24⁵	0.27⁸	0.45⁷	0.69⁵	0.80⁴	3.29⁶	3.93¹⁰
Learning Based [13]	2009	5.5	0.66⁶	1.59⁹	0.11⁵	3.22⁴	0.26⁶	0.42⁴	0.78⁷	0.84⁵	2.96⁵	2.87⁴
Nonlocal matting^s [5]	2011	14.3	1.62¹³	8.93¹⁵	0.78¹⁴	21.62¹⁶	1.22¹⁴	2.06¹⁴	2.77¹⁵	2.50¹³	13.52¹⁵	12.30¹⁴
Shared Matting [4]	2010	10.8	1.03⁹	3.51¹¹	0.17⁹	5.77¹²	0.48¹¹	1.51¹³	1.62¹¹	2.57¹⁴	4.04⁹	3.78⁹
Comprehensive Sampling^s [10]	2008	8.9	1.09¹⁰	0.95⁴	0.26¹²	3.63⁸	0.27⁷	0.48⁸	0.98⁸	1.71¹⁰	4.65¹¹	4.42¹¹
KNN Matting [1]	2012	12.3	3.05¹⁴	6.83¹⁴	1.32¹⁶	3.49⁶	0.70¹³	1.11¹¹	1.96¹²	2.28¹¹	6.30¹³	8.55¹³
Spectral Matting^s [6]	2012	14.0	9.36¹⁶	20.54¹⁶	0.11³	16.55¹⁴	3.15¹⁶	2.36¹⁵	2.91¹⁶	7.89¹⁵	8.87¹⁴	22.44¹⁵
Sparse Sampling^s [14]	2016	7.2	0.79⁸	1.18⁵	0.18¹⁰	3.52⁷	0.26⁵	0.44⁶	0.98⁹	1.29⁸	3.39⁷	3.36⁷
Deep Matting [16]	2016	4.1	0.56³	0.47³	1.05¹⁵	1.18²	0.19³	0.27³	0.67³	0.57³	1.90³	1.99³
Self-Adaptive [17]	2016	6.7	0.76⁷	1.34⁷	0.12⁶	4.94¹¹	0.24⁴	0.43⁵	0.69⁴	1.12⁷	3.66⁸	3.62⁸
Information-Flow [20]	2017	10.0	1.62¹²	4.61¹²	0.42¹³	4.72¹⁰	0.46¹⁰	1.10¹⁰	2.60¹⁴	1.54⁹	2.56⁴	3.19⁶
Background Matting [21]	2020	2.6	0.39²	0.37²	0.16⁸	2.52³	0.11²	0.20²	0.42¹	0.35²	1.51²	1.85²
FBA Matting [22]	2020	1.2	0.28¹	0.31¹	0.09²	0.97¹	0.09¹	0.16¹	0.51²	0.28¹	0.78¹	0.92¹
Bayesian Matting [2]	2001	15.4	6.71¹⁵	11.09¹⁴	3.97¹⁶	19.88¹⁵	3.00¹⁵	5.18¹⁶	4.08¹⁵	37.45¹⁶	16.50¹⁶	25.08¹⁶
Robust Matting [12]	2007	11.7	2.40¹²	2.54¹⁰	0.23⁷	7.65¹³	1.18¹³	2.70¹⁴	1.57¹⁰	4.16¹³	7.26¹³	7.15¹²
Refine Edge [15]	2013	5.7	0.81⁴	2.18⁸	0.12¹	3.91⁷	0.54⁸	0.84⁹	0.90⁵	1.14⁴	4.51⁷	3.39⁴
Closed Form [7]	2008	6.1	0.88⁵	1.70⁶	0.24⁸	3.64⁶	0.46⁶	0.72⁷	0.85⁴	1.20⁵	4.36⁶	5.21⁸
Learning Based [13]	2009	5.5	1.03⁶	1.88⁷	0.21⁵	3.62⁵	0.43⁵	0.61⁴	1.00⁷	1.32⁶	4.27⁵	3.74⁵
Nonlocal matting^s [5]	2011	13.2	1.76¹⁰	11.85¹⁵	1.10¹³	22.25¹⁶	1.29¹⁴	2.46¹²	3.00¹³	2.88¹⁰	16.36¹⁵	12.20¹⁴
Shared Matting [4]	2010	10.8	1.74⁹	6.02¹¹	0.27⁹	7.46¹²	0.85¹¹	2.70¹³	2.26¹¹	7.65¹⁴	6.35¹¹	5.05⁷
Comprehensive Sampling^s [10]	2008	9.1	2.28¹¹	1.55⁵	0.32¹⁰	3.97⁸	0.54⁷	0.73⁸	1.35⁸	3.01¹¹	6.59¹²	6.11¹¹
KNN Matting [1]	2012	11.4	4.24¹⁴	7.84¹³	1.32¹⁴	3.52⁴	0.88¹²	1.38¹⁰	2.34¹²	3.04¹²	6.17¹⁰	9.87¹³
Spectral Matting^s [6]	2012	13.8	13.17¹⁶	31.38¹⁶	0.18³	18.16¹⁴	5.33¹⁶	2.78¹⁵	3.74¹⁴	10.24¹⁵	9.19¹⁴	22.92¹⁵
Sparse Sampling^s [14]	2016	6.7	1.37⁷	1.32⁴	0.23⁶	4.02⁹	0.35⁴	0.65⁵	1.43⁹	2.19⁹	4.67⁸	4.66⁶
Deep Matting [16]	2016	4.0	0.64³	0.60²	1.34¹⁵	1.22²	0.22³	0.37³	0.75³	0.78³	2.27³	2.21³
Self-Adaptive [17]	2016	8.0	1.42⁸	2.49⁹	0.19⁴	5.48¹¹	0.55⁹	0.66⁶	0.99⁶	1.94⁸	5.20⁹	6.04¹⁰
Information-Flow [20]	2017	10.4	2.67¹³	6.25¹²	0.71¹²	5.32¹⁰	0.57¹⁰	1.54¹¹	5.32¹⁶	1.92⁷	2.77⁴	5.83⁹
Background Matting [21]	2020	3.0	0.47²	0.65³	0.49¹¹	2.98³	0.12²	0.22²	0.43¹	0.41²	1.56²	2.02²
FBA Matting [22]	2020	1.2	0.35¹	0.43¹	0.14²	1.05¹	0.12¹	0.19¹	0.53²	0.34¹	0.82¹	1.06¹

city
rain
concert
flowers
snow
Slava
Vitaliy
Artem
juneau
woods

Source
Trimap
BM [2]
RM [12]
RE [15]
CF [7]
LB [13]
NlM [5]
ShM [4]
CS [10]
KNN [1]
SpM [6]
SpSM [14]
DM [16]
SAM [17]
IFM [20]
MWBE [21]
FBAM [22]

0 %

Note: Make sure you are using the latest version of your web browser (we recommend to use chromium-based web browsers)

Integral Plots

Subjective comparison

We carried out subjective comparison of 13 matting methods using Subjectify.us platform. We applied matting methods to videos from our dataset and then uploaded videos containing extracted foreground objects and ground-truth sequences to Subjectify.us. The platform hired study participants and showed them these videos in pairwise fashion. For each pair, participants were asked to choose the video with better visual quality or indicate that they are approximately equal. Each study participant compared 30 pairs including 4 hidden quality-control comparisons between ground truth and a low-quality method; answers of 23 participants were rejected, since they failed at least one quality-control question. In total 10556 answers from 406 participants were collected. Bradley-Terry [18] and Crowd Bradley-Terry [19] models were used to convert pairwise comparisons to subjective ranks. The study report generated by the platform is shown below.

Multidimensional Analysis

References

[1]	Qifeng Chen, Dingzeyu Li, and Chi-Keung Tang. KNN matting. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 35(9):2175–2188, 2013. [ doi , project page ]
[2]	Yung-Yu Chuang, Brian Curless, David H. Salesin, and Richard Szeliski. A bayesian approach to digital matting. In Computer Vision and Pattern Recognition (CVPR), volume 2, pages II-264–II-271, 2001. [ doi , project page , code ]
[3]	Mikhail Erofeev, Yury Gitman, Dmitriy Vatolin, Alexey Fedorov, Jue Wang. Perceptually Motivated Benchmark for Video Matting. British Machine Vision Conference (BMVC), pages 99.1–99.12, 2015. [ doi , pdf , project page ]
[4]	Eduardo S.L. Gastal and Manuel M. Oliveira. Shared sampling for real-time alpha matting. Computer Graphics Forum, 29(2):575–584, 2010. [ project page ]
[5]	Philip Lee and Ying Wu. Nonlocal matting. In Computer Vision and Pattern Recognition (CVPR), pages 2193–2200, 2011. [ code ]
[6]	A. Levin, A. Rav Acha, and D. Lischinski. Spectral matting. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 30(10):1699–1712, 2008. [ doi , project page ]
[7]	Anat Levin, Dani Lischinski, and Yair Weiss. A closed-form solution to natural image matting. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 30(2):228–242, 2008. [ doi , code ]
[8]	Christoph Rhemann, Carsten Rother, Jue Wang, Margrit Gelautz, Pushmeet Kohli, and Pamela Rott. A perceptually motivated online benchmark for image matting. In Computer Vision and Pattern Recognition (CVPR), pages 1826–1833, 2009. [ doi ]
[9]	E. Shahrian and D. Rajan. Weighted color and texture sample selection for image matting. In Computer Vision and Pattern Recognition (CVPR), pages 718–725, 2012. [ doi , code ]
[10]	E. Shahrian, D. Rajan, B. Price, and S. Cohen. Improving image matting using comprehensive sampling sets. In Computer Vision and Pattern Recognition (CVPR), pages 636–643, 2013. [ doi , code ]
[11]	Karen Symonyan, Sergey Grishin, Dmitriy Vatolin, and Dmitriy Popov. Fast video superresolution via classification. International Conference on Image Processing (ICIP), pages 349–352, 2008. [ doi ]
[12]	Jue Wang and Michael F. Cohen. Optimized color sampling for robust matting. In Computer Vision and Pattern Recognition (CVPR), pages 1–8, 2007. [ doi , project page ]
[13]	Yuanjie Zheng and C. Kambhamettu. Learning based digital matting. In International Conference on Computer Vision (ICCV), pages 889–896, 2009. [ doi , code ]
[14]	Levent Karacan, Aykut Erdem, Erkut Erdem. Alpha Matting with KL-Divergence Based Sparse Sampling, IEEE Transactions on Image Processing, 2017.
[15]	http://www.adobe.com/en/products/aftereffects.html, Refine Edge tool in Adobe After Effects CC.
[16]	Ning Xu, Brian Price, Scott Cohen, and Thomas Huang. Deep Matting. In Computer Vision and Pattern Recognition (CVPR), 2017
[17]	Guangying Cao, Jianwei Li, Xiaowu Chen, Zhiqiang He. Patch-based self-adaptive matting for high-resolution image and video, The Visual Computer, 1-15, 2017.
[18]	Ralph Allan Bradley and Milton E Terry. Rank analysis of incomplete block designs: I. the method of paired comparisons. Biometrika, 39(3/4):324–345, 1952.
[19]	Chen, Xi, et al. Pairwise ranking aggregation in a crowdsourced setting. Proceedings of the sixth ACM international conference on Web search and data mining. ACM, 2013.
[20]	Yagiz Aksoy, Tunc Ozan Aydin and Marc Pollefeys. Designing Effective Inter-Pixel Information Flow for Natural Image Matting. In Computer Vision and Pattern Recognition (CVPR), 2017. [ doi , code ]
[21]	Matting with Background Estimation: A Novel Method for Extracting Clean Foreground, IEEE Transaction on Image Processing 2020 (anonymous submission)
[22]	F, B, Alpha Matting. Anonymous ECCV 2020 submission #6826