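What is Micro-GPS? (Excerpts from the paper)

For the details of Micro-GPS, please read the paper itself. Here I quote three sections from it, which should be enough to give a rough picture of what the system is.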

I. INTRODUCTION

III. SYSTEM-A. Mapping

V. APPLICATION: AUTOMATIC PATH FOLLOWING

The paper is titled

High-Precision Localization Using Ground Texture
地面のテクスチャを使用した高精度の自己位置推定

Why the authors came up with such a system is not quoted here, but it is covered in Section II, RELATED WORK.

For the specifics, see Section III, SYSTEM, from subsection B onward.

For how well it performs, see Section IV, EVALUATION.

For what is still missing, read Section VI, DISCUSSION.

I. INTRODUCTION

The Global Positioning System (GPS) receiver has become an essential component of both hand-held mobile devices and vehicles of all types.
全地球測位システム（GPS）受信機は、携帯型のモバイル機器やあらゆる種類の車両に不可欠なコンポーネントとなっています。

Applications of GPS, however, are constrained by a number of known limitations.
ただし、GPSのアプリケーションは、いくつかの既知の制限によって制約されます。

A GPS receiver must have access to unobstructed lines of sight to a minimum of four satellites, and obscured satellites significantly jeopardize localization quality.
GPS受信機は、最低でも4機の衛星に対して遮るもののない見通しを確保しなければならず、衛星が見えないと測位品質が著しく低下します。

Indoors, a GPS receiver either is slow to obtain a fix, or more likely does not work at all.
屋内では、GPS受信機は測位結果を得るのに時間がかかるか、あるいは多くの場合まったく機能しません。

Even outdoors, under optimal circumstances, accuracy is limited to a few meters (or perhaps a meter with modern SBAS systems).
屋外でも、最適な状況下では、精度は数メートル（または最新のSBASシステムでは1メートル）に制限されます。

These limitations make GPS insufficient for fine-positioning applications such as guiding a car to a precise location in a parking lot, or guiding a robot within an indoor room or warehouse.
これらの制限により、GPSは、駐車場の正確な場所に車を誘導したり、屋内の部屋や倉庫内でロボットを誘導したりするなど、細かい位置に配置するアプリケーションには不十分です。

To overcome the robustness and accuracy limitations of GPS, alternative localization technologies have been proposed, which are either less accurate than GPS (e.g., triangulation of cellphone towers and WiFi hotspots), or expensive and cumbersome to deploy (e.g., RFID localization or special-purpose sensors embedded in the environment).
GPSのロバスト性と精度の限界を克服するために、代替となる位置推定技術が提案されていますが、これらはGPSよりも精度が低い（例：携帯電話の基地局やWiFiホットスポットの三角測量）か、または高価で導入が面倒です（例：RFID位置推定や環境に埋め込まれた特別な目的のセンサー）。

Inertial navigation and odometry, which are often used in robotics for fine-positioning tasks, require a known initial position, drift over time, and lose track (requiring manual re-initialization) when the device is powered off.
慣性航法とオドメトリは、ロボット工学で精密な位置決めタスクによく使用されますが、既知の初期位置を必要とし、時間の経過とともにドリフトし、デバイスの電源がオフになると位置を見失います（手動での再初期化が必要）。

This paper proposes a system that provides millimeter-scale localization, both indoors and outside on land.
この論文は、陸上の屋内と屋外の両方でミリメートルスケールの位置推定を提供するシステムを提案します。

The key observation behind our approach is that seemingly-random ground textures exhibit distinctive features that, in combination, provide a means for unique identification.
私たちのアプローチの背景にある重要な観察は、一見ランダムに見える地面のテクスチャには特徴があり、それが組み合わさることで、固有の識別手段となるということです。

Even apparently homogeneous surfaces contain small imperfections – cracks, scratches, or even a particular arrangement of carpet fibers – that are persistently identifiable as local features.
一見、均一に見える表面にも、小さな傷やひっかき傷、あるいはカーペットの繊維の並びなどがあり、それらは局所的な特徴として持続的に識別されます。

While a single feature is not likely to be unique over a large area, the spatial relationship among a group of such features in a small region is likely to be distinctive, at least up to the uncertainty achievable with coarse localization methods such as GPS or WiFi triangulation.
1つの特徴が広い範囲で固有のものである可能性は低いですが、小さな地域におけるそのような特徴のグループの空間的関係は、少なくともGPSやWiFi三角測量などの粗い位置推定方法で達成する不確実さの内では、特徴的である可能性があります。

Inspired by this observation, we construct a system called Micro-GPS that includes a downward-facing camera to capture fine-scale ground textures, and an image processing unit capable of locating that texture patch in a pre-constructed compact database within a few hundred milliseconds.
この観察に着想を得て、Micro-GPSと呼ばれるシステムを構築します。このシステムには、下向きのカメラで微細な地面のテクスチャをキャプチャし、そのテクスチャパッチを事前に構築されたコンパクトなデータベースに数百ミリ秒以内に配置できる画像処理ユニットが含まれています。

The use of image features for precise localization has a rich history, including works such as Photo Tourism [1] and Computational Re-Photography [2].
画像の特徴を利用して正確な位置推定する方法には、Photo Tourism [1]やComputational Re-Photography [2]などの歴史があります。

Thus, a main contribution of our work is determining how some of the algorithms used for feature detection and matching in “natural” images, as used by previous work, can be adapted for “texture-like” images of the ground.
したがって、私たちの仕事の主な貢献は、前の仕事で使用された「自然な」画像の特徴検出とマッチングに使用されるアルゴリズムのいくつかを、地面の「テクスチャのような」画像にどのように適合させることができるかを決定することです。

In searching for a robust combination of such methods, we exploit two key advantages of ground texture images.
このような方法の堅牢な組み合わせを探す際に、地面のテクスチャ画像の2つの重要な利点を活用します。

First, the ground can be photographed from much closer range than typical features in the environment, leading to an order-of-magnitude improvement in precision.
まず、地面は、環境内の一般的な特徴よりもはるかに近い距離から写真を撮ることができ、精度が桁違いに向上します。

Second, the statistics of texture-like images lead to a greater density of features, leading to greater robustness over time.
2つ目は、テクスチャのような画像の統計的性質により、特徴量の密度が高くなり、時間経過に対するロバスト性が高くなることです。

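Figure 1: System overview.
Our test robot has an NVIDIA Jetson TX1 development board that controls a Point Grey monocular camera.
A light shield and a ring of LEDs around the camera provide controlled illumination.
As a pre-process, database images are first captured and stitched into a globally consistent map.
Later, to locate the robot, features are extracted from a query image, and those features vote for potential image poses in the map.
A peak in the voting map determines the inlier features, from which the pose of the query image is recovered.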

Our system consists of two phases: an offline database construction phase, and an online localization phase (Figure 1).
私たちのシステムは、オフラインデータベース構築フェーズとオンライン位置推定フェーズの2つのフェーズで構成されています（図1）。

We begin by collecting ground texture images and aligning them using global pose optimization.
まず、地面のテクスチャ画像を収集し、グローバルな姿勢最適化を用いて位置合わせを行います。

We extract local features (keypoints) and store them in a database, which is subsequently compressed to a manageable size.
局所的な特徴（キーポイント）を抽出してデータベースに格納し、その後、扱いやすいサイズに圧縮します。

For localization, we find keypoints in a query image and search the database for candidate matches using approximate nearest neighbor matching.
位置推定では、クエリ画像の中からキーポイントを見つけ、近似最近傍照合を用いてデータベースを検索し、マッチする候補を探します。

Because it is common for more than 90% of the matches to be spurious, we use voting to reject outliers, based on the observation that inlier matches will vote for a consistent location whereas outliers distribute their votes randomly.
90%以上のマッチが偽物であることはよくあることなので、誤対応を排除するために投票を使用します。これは、誤対応のマッチがランダムに投票を分配するのに対し、正対応のマッチは一貫した場所に投票するという観察に基づいています。

Finally, we use the remaining inlier matches to precisely calculate the location of the query image.
最後に、残りの正対応のマッチを使って、クエリ画像の位置を正確に計算します。
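The voting step described above can be illustrated with a minimal sketch. It assumes each candidate match has already been converted into a 2D position vote on the map and ignores rotation entirely; the function names `cell_of` and `vote_for_location`, the grid cell size, and the sample coordinates are hypothetical and not taken from the paper.

```python
from collections import Counter


def cell_of(point, cell_size):
    """Quantize a 2D map position into a coarse grid cell."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def vote_for_location(votes, cell_size=50.0):
    """Return (inlier_indices, mean_position) for the most-voted grid cell.

    votes: list of (x, y) map positions, one per candidate match.
    """
    # Each match casts one vote for the cell containing its predicted position.
    cells = Counter(cell_of(v, cell_size) for v in votes)
    peak, _count = cells.most_common(1)[0]
    # Matches that voted for the peak cell are treated as inliers.
    inliers = [i for i, v in enumerate(votes) if cell_of(v, cell_size) == peak]
    mean_x = sum(votes[i][0] for i in inliers) / len(inliers)
    mean_y = sum(votes[i][1] for i in inliers) / len(inliers)
    return inliers, (mean_x, mean_y)


# Three consistent votes near (120, 330) mixed with scattered spurious ones.
votes = [(120.0, 330.0), (900.0, 40.0), (122.0, 331.0), (15.0, 760.0),
         (430.0, 610.0), (118.0, 329.0), (705.0, 220.0), (260.0, 75.0)]
inliers, position = vote_for_location(votes)
print(inliers, position)
```

A fixed grid like this can split a cluster of votes across a cell boundary, so a practical version would need smoothing or overlapping cells. Averaging the inliers here only stands in for the precise pose computation mentioned in the text.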

The major contributions of this paper are:
この論文の主な貢献は次のとおりです。

• Describing a low-cost global localization system based on ground textures and making relevant code and instructions available for reproduction.
地面のテクスチャに基づく低コストのグローバル位置推定システムを記述し、再現のための関連コードと手順を公開します。

• Capturing and making available datasets of seven indoor and outdoor ground textures.
7つの屋内および屋外の地面テクスチャのデータセットをキャプチャして利用できるようにします。

• Investigating the design decisions necessary for practical matching in texture-like images, as opposed to natural images.
自然な画像とは異なり、テクスチャのような画像で実用的なマッチングを行うために必要な設計上の判断を調べます。

This includes the choice of descriptor, strategies for reducing storage costs, and a robust voting procedure that can find inliers with high reliability.
これには、記述子の選択、ストレージコストを削減するための戦略、信頼性の高い正対応を見つけることができるロバストな投票手順などが含まれます。

• Demonstrating a real-world application of precise localization: a robot that uses Micro-GPS to record a path and then follow it with sub-centimeter accuracy.
正確な位置推定の実際のアプリケーションのデモンストレーション：Micro-GPSを使用して経路を記録し、それを1センチメートル未満の精度で追跡するロボット。

The ability to localize a vehicle or robot precisely has the potential for far-reaching applications.
車両またはロボットを正確に位置推定する機能は、広範囲にわたるアプリケーションの可能性を秘めています。

A car could accurately park (or guide the driver to do so) in any location it recognizes from before, avoiding obstacles mere centimeters away.
車は、わずか数センチ離れた障害物を避けて、以前から認識している任意の場所に正確に駐車する（またはそうするようにドライバーを案内する）ことができます。

A continuously-updated map of potholes could be used to guide drivers to turn slightly to avoid them.
継続的に更新される道路の穴の地図を使用して、ドライバーが穴を回避するためにわずかに曲がるように誘導することができます。

The technology applies equally well to vehicles smaller than cars, such as Segways, electric wheelchairs, and mobility scooters for the elderly or disabled, any of which could be guided to precise locations or around hard-to-see obstacles.
この技術は、セグウェイ、電動車椅子、高齢者や障害者向けのモビリティスクーターなど、車よりも小さい車両にも同様に適用できます。これらの車両はいずれも、正確な場所や見えにくい障害物の周囲に誘導できます。

Indoor applications include guidance of warehouse robots and precise control over assistive robotics in the home.
屋内アプリケーションには、倉庫ロボットのガイダンスと家庭内の支援ロボットの正確な制御が含まれます。

III. SYSTEM

A. Mapping

Hardware Setup and Data Collection: Our imaging system consists of a Point Grey CM3 grayscale camera pointed downwards at the ground (Figure 1, left).
ハードウェアのセットアップとデータ収集：撮像システムは、Point Grey（現 FLIR）社製のグレースケールカメラCM3を地面の方に向けて設置しています（図1左）。

A shield blocks ambient light around the portion of the ground imaged by the camera, and a set of LED lights arranged symmetrically around the lens provides rotation-invariant illumination.
シールドは、カメラによって画像化された地面の部分の周りの周囲光を遮断し、レンズの周りに対称的に配置されたLEDライトのセットは、回転不変の照明を提供します。

The distance from the camera to the ground is set to 260 mm for most types of textures we have experimented with.
カメラから地面までの距離は、私たちが実験したほとんどの種類のテクスチャで260mmに設定されています。

Our system is insensitive to this distance, as long as a sufficient number of features can be detected.
我々のシステムは、十分な数の特徴が検出される限り、この距離に影響されません。

The camera output is processed by an NVIDIA Jetson TX1 development board.
カメラの出力は、NVIDIA Jetson TX1開発ボードで処理されます。

Our prototype has the camera and development board mounted on a mobile cart, which may be moved manually or can be driven with a pair of computer-controlled motorized wheels.
私たちのプロトタイプでは、カメラと開発ボードがモバイルカートに取り付けられており、手動で移動することも、コンピューター制御の電動ホイールで駆動することもできます。

The latter capability is used for the “automatic path following” demonstration described in Section V.
後者の機能は、セクションVで説明されている「自動経路追跡」のデモンストレーションに使用されます。

For initial data capture, however, we manually move the cart in a zig-zag path to ensure that an area can be fully covered.
しかし、最初のデータ取得の際には、人力でカートをジグザグに移動させ、エリアを完全にカバーできるようにしています。

This process, while easily mastered by non-experts, could be automated by putting in more engineering effort or even through crowd-sourcing when there are more users.
このプロセスは、専門家でなくても簡単にマスターできるものですが、技術的な努力を重ねることで自動化することもできますし、ユーザー数が多ければクラウドソーシングを利用することもできます。

Image Stitching: To construct a global map, we assume that the surface is locally planar, which is true even for most outdoor surfaces.
画像のステッチング：グローバルマップを作成するために、サーフェスが局所的に平面であると仮定します。これは、ほとんどの屋外サーフェスにも当てはまります。

Our image stitching pipeline consists of frame-to-frame matching followed by global optimization, leveraging extensive loop closures provided by the zig-zag path.
私たちの画像スティッチングパイプラインは、フレーム間のマッチングとそれに続くグローバルな最適化で構成され、ジグザグ経路によって提供される広範なループクロージャを活用します。

Since the computation becomes significantly more expensive as the area grows, we split a large area into several regions (which we reconstruct separately) and link the already-reconstructed regions.
面積が大きくなると計算コストが非常に高くなるため、大きな面積をいくつかの領域に分割して（別々に再構成して）、すでに再構成された領域をリンクさせています。

This allows us to quickly map larger areas with decent quality.
これにより、より広いエリアを適切な品質ですばやくマッピングできます。

Figure 1 (right) shows the “asphalt” dataset, which covers 19.76 m² in high detail.
図1の右側は、19.76 m²を詳細にカバーする「アスファルト」データセットを示しています。
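To show why loop closures help the global optimization, here is a small sketch under strong assumptions: frames differ only by translation (no rotation), every pairwise measurement is weighted equally, and the problem is solved as plain linear least squares with NumPy. The function name `optimize_positions` and the sample measurements are hypothetical; the passage quoted here does not describe the optimizer the authors actually use.

```python
import numpy as np


def optimize_positions(num_frames, measurements):
    """Least-squares frame positions from relative offsets.

    measurements: list of (i, j, dx, dy), meaning pos[j] - pos[i] ~ (dx, dy).
    Frame 0 is anchored at the origin to remove the global translation freedom.
    """
    rows = len(measurements) + 1
    a = np.zeros((rows, num_frames))
    b = np.zeros((rows, 2))
    for r, (i, j, dx, dy) in enumerate(measurements):
        a[r, i] = -1.0
        a[r, j] = 1.0
        b[r] = (dx, dy)
    a[-1, 0] = 1.0  # anchor row: pos[0] = (0, 0)
    positions, *_ = np.linalg.lstsq(a, b, rcond=None)
    return positions


# Two frame-to-frame offsets plus one loop-closure measurement (0 -> 2)
# that disagrees with their sum by 0.3 along x.
measurements = [(0, 1, 1.0, 0.0), (1, 2, 1.0, 0.0), (0, 2, 2.3, 0.0)]
positions = optimize_positions(3, measurements)
print(positions)
```

In this toy case the 0.3 disagreement is spread evenly over the three measurements instead of accumulating at the end of the chain, which is the effect the loop closures from a zig-zag path provide.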

Datasets: We have experimented with a variety of both indoor and outdoor datasets, covering ground types ranging from ordered (carpet) to highly stochastic (granite), and including both the presence (concrete) and absence (asphalt) of visible large-scale variation.
データセット：秩序のある面（カーペット）から非常に確率的な面（花崗岩）まで、また、目に見える大規模な変化がある場合（コンクリート）とない場合（アスファルト）を含む、屋内と屋外のさまざまなデータセットを使って実験を行いました。

We have also captured test images for the datasets on a different day (to allow perturbations to the ground surfaces) to evaluate our system.
また、このシステムを評価するために、データセットのテスト画像を別の日に撮影しました（地表面に摂動を与えるため）。

Figure 2: Example texture patches cropped from the datasets.
Outdoor and indoor textures are marked in blue and red, respectively.

Figure 2 shows example patches from our dataset.
図2は、データセットのパッチの例を示しています。

We will make these datasets, together with databases of SIFT features and test-image sequences, available to the research community.
これらのデータセットを、SIFT特徴およびテスト画像シーケンスのデータベースとともに、研究コミュニティで利用できるようにします。

Database Construction: The final stage in building a map is extracting a set of features from the images and constructing a data structure for efficiently locating them.
データベースの構築：マップ構築の最終段階は、画像から一連の特徴を抽出し、それらを効率的に見つけるためのデータ構造を構築することです。

This step involves some key decisions, which we evaluate in Section IV.
このステップには、セクションIVで評価するいくつかの重要な決定が含まれます。

Here we only describe our actual implementation.
ここでは、実際の実装についてのみ説明します。

We use the SIFT scale-space DoG detector and gradient orientation histogram descriptor [30], since we have found it to have high robustness and (with its GPU implementation [31]) reasonable computational time.
SIFT scale-space DoG 検出器と勾配方向ヒストグラム記述子[30]を使用します。これは、SIFTが高いロバスト性を持ち、（GPU実装[31]を使用して）妥当な計算時間を持っていることがわかっているためです。

For each image in the map, we typically find 1000 to 2000 SIFT keypoints, and randomly select 50 of them to be stored in the database.
マップ内の各画像について、通常、1000から2000のSIFTキーポイントを見つけ、データベースに保存するためにそれらの内の50をランダムに選択します。

This limits the size of the database itself, as well as the data structures used for accelerating nearest-neighbor queries.
このため、データベース自体のサイズや、最近傍探索を高速化するためのデータ構造が制限されています。

We choose random selection after observing that features with higher DoG response are not necessarily highly repeatable features: they are just as likely to be due to noise, dust, etc.
DoGレスポンスが高い特徴は、必ずしも再現性の高い特徴ではないことを観察した上で、ランダムな選択を行っています：ノイズやほこりなどが原因である可能性も同じです。

To further speed up computation and reduce the size of the database, we apply PCA [32] to the set of SIFT descriptors and project each descriptor onto the top k principal components.
計算をさらに高速化し、データベースのサイズを縮小するために、PCA [32]をSIFT記述子のセットに適用し、各記述子を上位k個の主成分に投影します。

As described in Section IV, for good accuracy we typically use k = 8 or k = 16 in our implementation, and there is minimal cost to using a “universal” PCA basis constructed from a variety of textures, rather than a per-texture basis.
セクションIVで説明されたように、精度を高めるために、我々の実装では通常、k = 8またはk = 16を使用しています。また、テクスチャごとの基底ではなく、さまざまなテクスチャから構築された「ユニバーサル」なPCA基底を使用することには、最小限のコストしかかかりません。
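The PCA projection can be sketched with NumPy as follows. This is an illustration under assumptions, not the authors' code: random vectors stand in for real 128-dimensional SIFT descriptors, the basis is obtained from an SVD of the centered data, and the names `fit_pca_basis` and `project` are hypothetical.

```python
import numpy as np


def fit_pca_basis(descriptors, k):
    """Return (mean, basis), where basis holds the top-k principal directions as rows."""
    mean = descriptors.mean(axis=0)
    centered = descriptors - mean
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def project(descriptors, mean, basis):
    """Project descriptors onto the PCA basis, giving k numbers per descriptor."""
    return (descriptors - mean) @ basis.T


rng = np.random.default_rng(0)
descriptors = rng.random((500, 128)).astype(np.float32)  # stand-in for SIFT descriptors
mean, basis = fit_pca_basis(descriptors, k=8)
compact = project(descriptors, mean, basis)
print(descriptors.shape, "->", compact.shape)
```

With k = 8, each stored descriptor shrinks from 128 values to 8. A "universal" basis in the sense of the text would correspond to fitting `mean` and `basis` once on descriptors pooled from several textures and reusing them everywhere.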

One of the major advantages of our system is that the height of the camera is fixed, so that the scale of a particular feature is also fixed.
このシステムの大きな利点の1つは、カメラの高さが固定されているため、特定の特徴のスケールも固定されていることです。

This means that when searching for a feature with scale s in the database, we only need to check features with scale s as well.
つまり、データベース内のスケールsの特徴を検索する際には、スケールsの特徴をチェックするだけでよいことになります。

In practice, to allow some inconsistency, we quantize scale into 10 buckets and divide the database into 10 groups based on scale.
実際には、ある程度の不整合を許容するために、スケールを10個のバケットに量子化し、データベースをスケールに基づいて10個のグループに分割します。

Then we build a search index for each group using FLANN [33].
次に、FLANN [33]を使用して各グループの検索インデックスを作成します。

During testing, given a feature with scale s, we only need to search for the nearest neighbor in the group to which s belongs.
テスト中に、スケールsの特徴が与えられた場合、sが属するグループ内の最近傍を検索するだけで済みます。
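The scale-bucket lookup can be sketched as below. Two things are assumptions made for illustration: the buckets are spaced linearly between a minimum and maximum scale (the text does not say how the 10 buckets are defined), and a brute-force search replaces the FLANN index so the example needs only the standard library. All function names and sample values are hypothetical.

```python
import math

NUM_BUCKETS = 10


def bucket_of(scale, s_min, s_max, num_buckets=NUM_BUCKETS):
    """Map a keypoint scale to a bucket index in [0, num_buckets - 1]."""
    t = (scale - s_min) / (s_max - s_min)
    return min(num_buckets - 1, max(0, int(t * num_buckets)))


def build_groups(features, s_min, s_max):
    """features: list of (scale, descriptor). Returns bucket -> list of feature ids."""
    groups = {b: [] for b in range(NUM_BUCKETS)}
    for idx, (scale, _desc) in enumerate(features):
        groups[bucket_of(scale, s_min, s_max)].append(idx)
    return groups


def nearest_in_group(query_scale, query_desc, features, groups, s_min, s_max):
    """Brute-force nearest neighbour restricted to the query's scale bucket."""
    candidates = groups[bucket_of(query_scale, s_min, s_max)]
    if not candidates:
        return None
    return min(candidates, key=lambda i: math.dist(query_desc, features[i][1]))


# Tiny database: (scale, 2-D descriptor) pairs standing in for PCA-reduced SIFT.
features = [(1.0, (0.0, 0.0)), (1.1, (1.0, 1.0)), (5.0, (0.1, 0.1)), (9.5, (0.9, 0.9))]
groups = build_groups(features, s_min=1.0, s_max=10.0)
match = nearest_in_group(1.05, (0.2, 0.1), features, groups, s_min=1.0, s_max=10.0)
print(match)
```

Feature 2 is the closest descriptor overall, but it lies in a different scale bucket and is never examined, so the query returns feature 0. Skipping the other buckets is where the time saving comes from.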

V. APPLICATION: AUTOMATIC PATH FOLLOWING

Our system provides a simple, inexpensive solution to achieve fine absolute positioning, and mobile robots having such a requirement represent an ideal application.
私たちのシステムは、精密な絶対位置測定をシンプルかつ安価に実現しており、このような要求を持つ移動ロボットには理想的なアプリケーションです。

To demonstrate the practicality of this approach, we build a robot that is able to follow a designed path exactly without any initialization of the position.
このアプローチの実用性を示すために、位置を初期化することなく、設計された経路を正確にたどることができるロボットを作りました。

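Figure 8: Path-following demonstration.
(a) Micro-GPS is implemented as a component of a mobile robot.
(b) The robot is driven manually to generate a path.
(c) Micro-GPS is then used to repeat the path.
Screenshots captured in the manual and automatic driving modes are highly consistent.
(d) The robot reaches the same end position with high accuracy.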

Our robot (shown in Figure 8a) has a differential drive composed of two 24V DC geared motors with encoders for closed-loop control of the motors.
私たちのロボット（図8aに示す）には、モーターの閉ループ制御用のエンコーダーを備えた2つの24VDCギア付きモーターで構成される差動ドライブがあります。

Using the encoder readings, we implemented dead-reckoning odometry on board to track the position of the robot at reasonable accuracy within a short distance.
このエンコーダの読み取り値を使って、推測航法オドメトリを実装し、近距離でのロボットの位置を合理的な精度で追跡しました。

The drift in odometry is corrected using Micro-GPS running on the onboard NVIDIA Jetson TX1 computer at ∼4fps.
オドメトリのドリフトは、NVIDIA Jetson TX1コンピュータ上で動作するMicro-GPSを用いて約4fpsで補正されます。
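The combination of encoder odometry and occasional absolute fixes can be sketched as follows. This assumes a standard differential-drive model with a midpoint update, and it corrects drift in the simplest possible way, by overwriting the estimate whenever a fix arrives; the text does not say how the authors fuse the two sources. The names `integrate_odometry` and `correct_with_fix`, the wheel base, and the sample fix are hypothetical.

```python
import math


def integrate_odometry(pose, d_left, d_right, wheel_base):
    """Advance pose (x, y, theta) given the distance travelled by each wheel (metres)."""
    x, y, theta = pose
    d_center = (d_left + d_right) / 2.0
    d_theta = (d_right - d_left) / wheel_base
    # Midpoint approximation: move along the heading halfway through the turn.
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    return (x, y, theta + d_theta)


def correct_with_fix(pose, fix):
    """Replace the drifting estimate with an absolute fix when one is available."""
    return fix if fix is not None else pose


# Drive straight for ten encoder updates of 1 cm per wheel.
pose = (0.0, 0.0, 0.0)
for _ in range(10):
    pose = integrate_odometry(pose, 0.01, 0.01, wheel_base=0.2)
drifted = pose

# A hypothetical absolute fix arrives and overrides the dead-reckoned pose.
corrected = correct_with_fix(drifted, (0.098, 0.002, 0.01))

# Turning in place: the wheels move in opposite directions.
spun = integrate_odometry((0.0, 0.0, 0.0), -0.05, 0.05, wheel_base=0.2)
print(drifted, corrected, spun)
```

Between fixes the pose advances at the encoder rate, so the controller always has an estimate even though absolute fixes arrive only a few times per second.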

To test the repeatability of navigation using this strategy, we first manually drive the robot along a particular path (Figure 8b), and mark its final location on the ground using a piece of tape.
この戦略を使用してナビゲーションの再現性をテストするには、最初にロボットを特定の経路に沿って手動で駆動し（図8b）、テープを使用して地面の最終的な位置をマークします。

The robot then goes back to its starting position and re-plays the same path, fully automatically.
その後、ロボットはスタート位置に戻り、同じ経路を全自動で再生します。

The sequences of the manual driving and automatic re-play are shown in the accompanying video; screen-shots from the video are compared in Figure 8c.
手動運転と自動再生のシーケンスは、添付のビデオに示されています；ビデオのスクリーンショットを図8cで比較します。

As shown in Figure 8d, the robot ends up in almost exactly the same position after automatic path following as it did after the manual driving.
図8dに示すように、自動経路追従後のロボットは、手動走行後とほぼ同じ位置に到達しています。

REFERENCES

[1] N. Snavely, S. M. Seitz, and R. Szeliski, “Photo Tourism: Exploring
photo collections in 3D,” ACM Transactions on Graphics, vol. 25,
no. 3, pp. 835–846, Jul. 2006.
[2] S. Bae, A. Agarwala, and F. Durand, “Computational re-photography,”
ACM Transactions on Graphics, vol. 29, no. 3, pp. 24:1–24:15, Jul.
2010.
[3] K. Dana, B. van Ginneken, S. Nayar, and J. Koenderink, “Reflectance
and texture of real-world surfaces,” ACM Transactions on Graphics,
vol. 18, no. 1, pp. 1–34, 1999.
[4] T. Leung and J. Malik, “Representing and recognizing the visual appearance of materials using three-dimensional textons,” International
Journal of Computer Vision (IJCV), vol. 43, no. 1, pp. 29–44, Jun.
2001.
[5] D. Heeger and J. Bergen, “Pyramid-based texture analysis/synthesis,”
in Proc. ACM SIGGRAPH, 1995, pp. 229–238.
[6] A. Efros and T. Leung, “Texture synthesis by non-parametric sampling,” in IEEE International Conference on Computer Vision (ICCV),
1999, pp. 1033–1038.
[7] A. Kelly, B. Nagy, D. Stager, and R. Unnikrishnan, “An infrastructure-free automated guided vehicle based on computer vision,” IEEE
Robotics & Automation Magazine, vol. 14, no. 3, pp. 24–34, 2007.
[8] H. Fang, M. Yang, R. Yang, and C. Wang, “Ground-texture-based
localization for intelligent vehicles,” IEEE Transactions on Intelligent
Transportation Systems, vol. 10, no. 3, pp. 463–468, Sep. 2009.
[9] K. Kozak and M. Alban, “Ranger: A ground-facing camera-based
localization system for ground vehicles,” in IEEE/ION Position, Location and Navigation Symposium (PLANS), 2016, pp. 170–178.
[10] W. Clarkson, T. Weyrich, A. Finkelstein, N. Heninger, J. A. Halderman, and E. W. Felten, “Fingerprinting blank paper using commodity
scanners,” in IEEE Symposium on Security and Privacy, 2009, pp.
301–314.
[11] T. Sattler, B. Leibe, and L. Kobbelt, “Fast image-based localization
using direct 2D-to-3D matching,” in IEEE International Conference
on Computer Vision (ICCV), 2011, pp. 667–674.
[12] Y. Li, N. Snavely, D. Huttenlocher, and P. Fua, “Worldwide pose estimation using 3D point clouds,” in European Conference on Computer
Vision (ECCV), 2012, pp. 15–29.
[13] R. Mur-Artal, J. Montiel, and J. D. Tardós, “ORB-SLAM: A versatile
and accurate monocular SLAM system,” IEEE Transactions on Robotics,
vol. 31, no. 5, pp. 1147–1163, Oct. 2015.
[14] A. Kendall, M. Grimes, and R. Cipolla, “PoseNet: A convolutional
network for real-time 6-DOF camera relocalization,” in IEEE International Conference on Computer Vision (ICCV), 2015, pp. 2938–2946.
[15] S. Ramalingam, S. Bouaziz, P. Sturm, and M. Brand, “Geolocalization
using skylines from omni-images,” in IEEE International Conference
on Computer Vision (ICCV) Workshops, 2009, pp. 23–30.
[16] Y. Li, N. Snavely, and D. Huttenlocher, “Location recognition using
prioritized feature matching,” in European Conference on Computer
Vision (ECCV), 2010, pp. 791–804.
[17] S. Cao and N. Snavely, “Minimal scene descriptions from structure
from motion models,” in Computer Vision and Pattern Recognition
(CVPR), 2014, pp. 461–468.
[18] B. Zeisl, T. Sattler, and M. Pollefeys, “Camera pose voting for large-scale image-based localization,” in IEEE International Conference on
Computer Vision (ICCV), 2015, pp. 2704–2712.
[19] Y. Avrithis and G. Tolias, “Hough pyramid matching: Speeded-up geometry re-ranking for large scale image retrieval,” International Journal of Computer Vision (IJCV), vol. 107, no. 1, pp. 1–19, Mar. 2014.
[20] X. Wu and K. Kashino, “Adaptive dither voting for robust spatial
verification,” in IEEE International Conference on Computer Vision
(ICCV), 2015, pp. 1877–1885.
[21] J. L. Schönberger, T. Price, T. Sattler, J.-M. Frahm, and M. Pollefeys,
“A vote-and-verify strategy for fast spatial verification in image retrieval,” in Asian Conference on Computer Vision (ACCV), 2016, pp.
321–337.
[22] L. Svärm, O. Enqvist, F. Kahl, and M. Oskarsson, “City-scale localization for cameras with known vertical direction,” IEEE Transactions
on Pattern Analysis and Machine Intelligence (PAMI), pp. 1455–1461,
2017.
[23] G. Baatz, K. Köser, D. Chen, R. Grzeszczuk, and M. Pollefeys, “Handling
urban location recognition as a 2D homothetic problem,” in European Conference on Computer Vision (ECCV), 2010, pp. 266–279.
[24] H. Lim, S. N. Sinha, M. F. Cohen, M. Uyttendaele, and H. J. Kim,
“Real-time monocular image-based 6-DoF localization,” International
Journal of Robotics Research, vol. 34, no. 4-5, pp. 476–492, Apr.
2015.
[25] S. Middelberg, T. Sattler, O. Untzelmann, and L. Kobbelt, “Scalable
6-DoF localization on mobile devices,” in European Conference on
Computer Vision (ECCV), 2014, pp. 268–283.
[26] A. Irschara, C. Zach, J.-M. Frahm, and H. Bischof, “From structure-from-motion point clouds to fast location recognition,” in Computer
Vision and Pattern Recognition (CVPR), 2009, pp. 2599–2606.
[27] A. Wendel, A. Irschara, and H. Bischof, “Natural landmark-based
monocular localization for MAVs,” in IEEE International Conference
on Robotics and Automation (ICRA), 2011, pp. 5792–5799.
[28] M. Schönbein and A. Geiger, “Omnidirectional 3D reconstruction in
augmented Manhattan worlds,” in IEEE/RSJ International Conference
on Intelligent Robots and Systems (IROS), 2014, pp. 716–723.
[29] C. Arth, M. Klopschitz, G. Reitmayr, and D. Schmalstieg, “Real-time
self-localization from panoramic images on mobile devices,” in IEEE
International Symposium on Mixed and Augmented Reality (ISMAR),
2011, pp. 37–46.
[30] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision (IJCV), vol. 60,
no. 2, pp. 91–110, Nov. 2004.
[31] C. Wu, “SiftGPU: A GPU implementation of scale invariant feature
transform (SIFT),” http://www.cs.unc.edu/~ccwu/siftgpu/, 2007.
[32] K. Pearson, “LIII. On lines and planes of closest fit to systems of
points in space,” The London, Edinburgh, and Dublin Philosophical
Magazine and Journal of Science, vol. 2, no. 11, pp. 559–572, 1901.
[33] M. Muja and D. G. Lowe, “Fast approximate nearest neighbors with
automatic algorithm configuration,” in International Conference on
Computer Vision Theory and Applications (VISAPP), 2009, pp. 331–
340.
[34] L. Zhang and S. Rusinkiewicz, “Learning to detect features in texture
images,” in Computer Vision and Pattern Recognition (CVPR), 2018,
pp. 6325–6333.
[35] H. Bay, T. Tuytelaars, and L. Van Gool, “SURF: Speeded up robust
features,” in European Conference on Computer Vision (ECCV), 2006,
pp. 404–417.
[36] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, “ORB: An efficient alternative to SIFT or SURF,” in IEEE International Conference
on Computer Vision (ICCV), 2011, pp. 2564–2571.
[37] Y. Tian, B. Fan, F. Wu et al., “L2-Net: Deep learning of discriminative
patch descriptor in Euclidean space,” in Computer Vision and Pattern
Recognition (CVPR), 2017.
[38] A. Mishchuk, D. Mishkin, F. Radenovic, and J. Matas, “Working hard
to know your neighbor’s margins: Local descriptor learning loss,” in
Advances in Neural Information Processing Systems, 2017, pp. 4826–
4837.
[39] K. He, Y. Lu, and S. Sclaroff, “Local descriptors optimized for average
precision,” in Computer Vision and Pattern Recognition (CVPR), 2018,
pp. 596–605.
[40] X. Zhang, F. X. Yu, S. Kumar, and S.-F. Chang, “Learning spread-out local feature descriptors,” in IEEE International Conference on
Computer Vision (ICCV), 2017, pp. 4605–4613.
[41] Z. Luo, T. Shen, L. Zhou, S. Zhu, R. Zhang, Y. Yao, T. Fang, and
L. Quan, “GeoDesc: Learning local descriptors by integrating geometry constraints,” European Conference on Computer Vision (ECCV),
2018.
[42] M. Keller, Z. Chen, F. Maffra, P. Schmuck, and M. Chli, “Learning
deep descriptors with scale-aware triplet networks,” in Computer Vision and Pattern Recognition (CVPR), 2018.
[43] S. A. Winder and M. Brown, “Learning local image descriptors,” in
Computer Vision and Pattern Recognition (CVPR), 2007, pp. 1–8.
[44] C. Wu, “VisualSFM: A visual structure from motion system,”
http://ccwu.me/vsfm/, 2011.
[45] C. Wu, S. Agarwal, B. Curless, and S. M. Seitz, “Multicore bundle
adjustment,” in Computer Vision and Pattern Recognition (CVPR),
2011, pp. 3057–3064.
[46] M. Cornick, J. Koechling, B. Stanley, and B. Zhang, “Localizing
ground penetrating radar: a step toward robust autonomous ground
vehicle localization,” Journal of Field Robotics, vol. 33, no. 1, pp.
82–102, 2016.

Appendix

What do “voting” and “majority vote” mean? Roughly this (example: the Hough transform)

Appendix2

Feature point matching with OpenCV

Appendix3

An example using PCA (principal component analysis)
