=====================
PHY Abstraction Layer
=====================

Purpose
=======

Most network devices consist of a set of registers which provide an interface
to a MAC layer, which communicates with the physical connection through a
PHY. The PHY concerns itself with negotiating link parameters with the link
partner on the other side of the network connection (typically, an Ethernet
cable), and provides a register interface to allow drivers to determine what
settings were chosen, and to configure what settings are allowed.

While these devices are distinct from the network devices, and conform to a
standard layout for the registers, it has been common practice to integrate
the PHY management code with the network driver. This has resulted in large
amounts of redundant code. Also, on embedded systems with multiple (and
sometimes quite different) Ethernet controllers connected to the same
management bus, it is difficult to ensure safe use of the bus.

Since the PHYs are devices, and the management busses through which they are
accessed are, in fact, busses, the PHY Abstraction Layer treats them as such.
In doing so, it has these goals:

#. Increase code-reuse
#. Increase overall code-maintainability
#. Speed development time for new network drivers, and for new systems

Basically, this layer is meant to provide an interface to PHY devices which
allows network driver writers to write as little code as possible, while
still providing a full feature set.

The MDIO bus
============

Most network devices are connected to a PHY by means of a management bus.
Different devices use different busses (though some share common interfaces).
In order to take advantage of the PHY Abstraction Layer (PAL), each bus
interface needs to be registered as a distinct device.

#. read and write functions must be implemented. Their prototypes are::

     int write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
     int read(struct mii_bus *bus, int mii_id, int regnum);

   mii_id is the address on the bus for the PHY, and regnum is the register
   number. These functions are guaranteed not to be called from interrupt
   time, so it is safe for them to block, waiting for an interrupt to signal
   that the operation is complete.

#. A reset function is optional. This is used to return the bus to an
   initialized state.

#. A probe function is needed. This function should set up anything the bus
   driver needs, set up the mii_bus structure, and register with the PAL using
   mdiobus_register. Similarly, there's a remove function to undo all of
   that (use mdiobus_unregister).

#. Like any driver, the device_driver structure must be configured, and init
   and exit functions are used to register the driver.

#. The bus must also be declared somewhere as a device, and registered.

For an example of how one driver implemented an mdio bus driver, see
drivers/net/ethernet/freescale/fsl_pq_mdio.c and an associated DTS file
for one of the users (e.g. "git grep fsl,.*-mdio arch/powerpc/boot/dts/").
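
The shape of the read and write callbacks can be tried out in plain user-space
C. The sketch below is not kernel code: ``struct mii_bus`` here is a cut-down
stand-in for the kernel structure, ``u16`` is replaced by ``uint16_t``, and
``fake_mdio_read``, ``fake_mdio_write``, ``fake_mdio_init`` and the in-memory
register file are hypothetical names invented for illustration.

```c
#include <stdint.h>
#include <string.h>

#define FAKE_PHY_COUNT 32 /* an MDIO bus addresses up to 32 PHYs */
#define FAKE_REG_COUNT 32 /* each PHY exposes 32 basic registers */

/* Cut-down stand-in for the kernel's struct mii_bus (assumption). */
struct mii_bus {
	int (*read)(struct mii_bus *bus, int mii_id, int regnum);
	int (*write)(struct mii_bus *bus, int mii_id, int regnum,
		     uint16_t value);
	/* Simulated hardware: one register file per PHY address. */
	uint16_t regs[FAKE_PHY_COUNT][FAKE_REG_COUNT];
};

/* Reject addresses or register numbers outside the simulated bus. */
int fake_args_ok(int mii_id, int regnum)
{
	return mii_id >= 0 && mii_id < FAKE_PHY_COUNT &&
	       regnum >= 0 && regnum < FAKE_REG_COUNT;
}

/* Return the 16-bit register value, or a negative number on error. */
int fake_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
{
	if (!fake_args_ok(mii_id, regnum))
		return -1;
	return bus->regs[mii_id][regnum];
}

/* Store the value and return 0, or a negative number on error. */
int fake_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
		    uint16_t value)
{
	if (!fake_args_ok(mii_id, regnum))
		return -1;
	bus->regs[mii_id][regnum] = value;
	return 0;
}

/* Plays the role of the "set up the mii_bus structure" step in probe. */
void fake_mdio_init(struct mii_bus *bus)
{
	memset(bus, 0, sizeof(*bus));
	bus->read = fake_mdio_read;
	bus->write = fake_mdio_write;
}
```

A real bus driver would talk to hardware in these callbacks and would call
mdiobus_register once the structure is filled in; the model only shows the
calling convention (negative return on failure, register value otherwise).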

(RG)MII/electrical interface considerations
===========================================

The Reduced Gigabit Medium Independent Interface (RGMII) is a 12-pin
electrical signal interface using a synchronous 125 MHz clock signal and
several data lines. Due to this design decision, a 1.5ns to 2ns delay must be
added between the clock line (RXC or TXC) and the data lines to let the PHY
(clock sink) have a large enough setup and hold time to sample the data lines
correctly. The PHY library offers different types of PHY_INTERFACE_MODE_RGMII*
values to let the PHY driver, and optionally the MAC driver, implement the
required delay. The values of phy_interface_t must be understood from the
perspective of the PHY device itself, leading to the following:

* PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any
  internal delay by itself; it assumes that either the Ethernet MAC (if
  capable) or the PCB traces insert the correct 1.5-2ns delay

* PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay
  for the transmit data lines (TXD[3:0]) processed by the PHY device

* PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay
  for the receive data lines (RXD[3:0]) processed by the PHY device

* PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for
  both transmit AND receive data lines from/to the PHY device

Whenever possible, use the PHY side RGMII delay for these reasons:

* PHY devices may offer sub-nanosecond granularity in how they allow a
  receiver/transmitter side delay (e.g: 0.5, 1.0, 1.5ns) to be specified. Such
  precision may be required to account for differences in PCB trace lengths

* PHY devices are typically qualified for a large range of applications
  (industrial, medical, automotive...), and they provide a constant and
  reliable delay across temperature/pressure/voltage ranges

* PHY device drivers in PHYLIB are reusable by nature, so a driver that can
  correctly configure a specified delay enables more designs with similar
  delay requirements to operate correctly

For cases where the PHY is not capable of providing this delay, but the
Ethernet MAC driver is capable of doing so, the correct phy_interface_t value
should be PHY_INTERFACE_MODE_RGMII, and the Ethernet MAC driver should be
configured correctly in order to provide the required transmit and/or receive
side delay from the perspective of the PHY device. Conversely, if the Ethernet
MAC driver looks at the phy_interface_t value, for any other mode but
PHY_INTERFACE_MODE_RGMII, it should make sure that the MAC-level delays are
disabled.
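
The division of responsibility described above can be summarised as a small
decision table. This is a user-space sketch, not PHYLIB code: ``enum
rgmii_mode`` is a local stand-in for the four RGMII members of
phy_interface_t, and ``rgmii_phy_delays`` and ``rgmii_mac_may_delay`` are
hypothetical helper names.

```c
#include <stdbool.h>

/* Local mirror of the four RGMII modes; the kernel's phy_interface_t has
 * many more members and different numeric values (assumption). */
enum rgmii_mode {
	RGMII_MODE_PLAIN, /* PHY_INTERFACE_MODE_RGMII */
	RGMII_MODE_ID,    /* PHY_INTERFACE_MODE_RGMII_ID */
	RGMII_MODE_RXID,  /* PHY_INTERFACE_MODE_RGMII_RXID */
	RGMII_MODE_TXID,  /* PHY_INTERFACE_MODE_RGMII_TXID */
};

struct rgmii_delays {
	bool tx; /* delay inserted on the transmit path */
	bool rx; /* delay inserted on the receive path */
};

/* Which internal delays the PHY is expected to insert for a given mode. */
struct rgmii_delays rgmii_phy_delays(enum rgmii_mode mode)
{
	struct rgmii_delays d = { false, false };

	switch (mode) {
	case RGMII_MODE_ID:
		d.tx = true;
		d.rx = true;
		break;
	case RGMII_MODE_TXID:
		d.tx = true;
		break;
	case RGMII_MODE_RXID:
		d.rx = true;
		break;
	case RGMII_MODE_PLAIN:
		break;
	}
	return d;
}

/* A MAC may add its own delays only when the PHY adds none at all. */
bool rgmii_mac_may_delay(enum rgmii_mode mode)
{
	return mode == RGMII_MODE_PLAIN;
}
```

Reading the table from the PHY's point of view keeps the two sides from both
adding a delay, which would shift the clock by roughly twice the intended
amount.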

If neither the Ethernet MAC nor the PHY is capable of providing the
required delays, as defined by the RGMII standard, several options may be
available:

* Some SoCs may offer a pin pad/mux/controller capable of configuring a given
  set of pins' strength, delays, and voltage; and it may be a suitable
  option to insert the expected 2ns RGMII delay.

* Modifying the PCB design to include a fixed delay (e.g: using a specifically
  designed serpentine), which may not require software configuration at all.

Common problems with RGMII delay mismatch
-----------------------------------------

When there is an RGMII delay mismatch between the Ethernet MAC and the PHY,
the clock and data line signals will most likely be unstable when the PHY or
MAC takes a snapshot of these signals to translate them into logical 1 or 0
states and reconstruct the data being transmitted/received. Typical symptoms
include:

* Transmission/reception partially works, and there is frequent or occasional
  packet loss observed

* Ethernet MAC may report some or all packets ingressing with an FCS/CRC
  error, or just discard them all

* Switching to lower speeds such as 10/100Mbits/sec makes the problem go away
  (since there is enough setup/hold time in that case)
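
The last symptom follows from simple timing arithmetic. The sketch below
assumes the customary RGMII clock rates of 125 MHz, 25 MHz and 2.5 MHz for
1000, 100 and 10 Mbit/s, and assumes that a new nibble is presented on every
clock edge only at gigabit speed; the function names are hypothetical.

```c
/* Clock period in picoseconds for a link speed in Mbit/s.
 * Returns 0 for speeds RGMII does not define. */
long rgmii_clock_period_ps(int speed_mbps)
{
	switch (speed_mbps) {
	case 1000:
		return 8000;   /* 125 MHz */
	case 100:
		return 40000;  /* 25 MHz */
	case 10:
		return 400000; /* 2.5 MHz */
	default:
		return 0;
	}
}

/* Time for which one nibble stays valid on the data lines. */
long rgmii_data_window_ps(int speed_mbps)
{
	long period = rgmii_clock_period_ps(speed_mbps);

	/* At gigabit speed data changes on both clock edges. */
	return speed_mbps == 1000 ? period / 2 : period;
}
```

Under these assumptions the data window is 4 ns at gigabit speed, so a missing
2 ns delay consumes half of it, while at 100 Mbit/s the same 2 ns is only a
twentieth of the 40 ns window.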

Connecting to a PHY
===================

Sometime during startup, the network driver needs to establish a connection
between the PHY device and the network device. At this time, the PHY's bus
and drivers need to all have been loaded, so it is ready for the connection.
At this point, there are several ways to connect to the PHY:

#. The PAL handles everything, and only calls the network driver when
   the link state changes, so it can react.

#. The PAL handles everything except interrupts (usually because the
   controller has the interrupt registers).

#. The PAL handles everything, but checks in with the driver every second,
   allowing the network driver to react first to any changes before the PAL
   does.

#. The PAL serves only as a library of functions, with the network device
   manually calling functions to update status, and configure the PHY.
168
169Letting the PHY Abstraction Layer do Everything
170===============================================
171
172If you choose option 1 (The hope is that every driver can, but to still be
173useful to drivers that can't), connecting to the PHY is simple:
174
175First, you need a function to react to changes in the link state. This
176function follows this protocol::
177
178 static void adjust_link(struct net_device *dev);
179
180Next, you need to know the device name of the PHY connected to this device.
181The name will look something like, "0:00", where the first number is the
182bus id, and the second is the PHY's address on that bus. Typically,
183the bus is responsible for making its ID unique.
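
As a small illustration of the naming scheme, the helper below builds such a
name from a bus id string and a PHY address. It is a user-space sketch;
``format_phy_name`` is a hypothetical helper, and the format string is assumed
to be equivalent to the PHY_ID_FMT macro that the kernel provides in
include/linux/phy.h.

```c
#include <stdio.h>

/* Build "<bus id>:<two-digit hex address>", e.g. "0:00".
 * Returns what snprintf returns: the length the full name needs. */
int format_phy_name(char *buf, size_t len, const char *bus_id,
		    unsigned int addr)
{
	return snprintf(buf, len, "%s:%02x", bus_id, addr);
}
```

Note that the address is printed in hexadecimal, so PHY address 31 on bus "0"
becomes "0:1f".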

Now, to connect, just call this function::

    phydev = phy_connect(dev, phy_name, &adjust_link, interface);

*phydev* is a pointer to the phy_device structure which represents the PHY.
If phy_connect is successful, it will return the pointer. dev, here, is the
pointer to your net_device. Once done, this function will have started the
PHY's software state machine, and registered for the PHY's interrupt, if it
has one. The phydev structure will be populated with information about the
current state, though the PHY will not yet be truly operational at this
point.

PHY-specific flags should be set in phydev->dev_flags prior to the call
to phy_connect() such that the underlying PHY driver can check for flags
and perform specific operations based on them.
This is useful if the system has put hardware restrictions on
the PHY/controller, of which the PHY needs to be aware.

*interface* is a u32 which specifies the connection type used
between the controller and the PHY. Examples are GMII, MII,
RGMII, and SGMII. See "PHY interface modes" below. For a full
list, see include/linux/phy.h

Now just make sure that phydev->supported and phydev->advertising have any
values pruned from them which don't make sense for your controller (a 10/100
controller may be connected to a gigabit capable PHY, so you would need to
mask off SUPPORTED_1000baseT*). See include/linux/ethtool.h for definitions
for these bitfields. Note that you should not SET any bits, except the
SUPPORTED_Pause and SUPPORTED_Asym_Pause bits (see below), or the PHY may get
put into an unsupported state.
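
The pruning step amounts to clearing bits in a mask. The sketch below models
it in user space with local copies of the legacy ethtool SUPPORTED_* bit
values, so that it builds without kernel headers; ``prune_gigabit`` is a
hypothetical helper. Newer kernels store these masks as link mode bitmaps
rather than a single 32-bit word, so treat this as a model of the idea rather
than code to paste into a driver.

```c
#include <stdint.h>

/* Legacy ethtool bit values, copied locally for a header-free example. */
#define SUPPORTED_10baseT_Half   (1u << 0)
#define SUPPORTED_10baseT_Full   (1u << 1)
#define SUPPORTED_100baseT_Half  (1u << 2)
#define SUPPORTED_100baseT_Full  (1u << 3)
#define SUPPORTED_1000baseT_Half (1u << 4)
#define SUPPORTED_1000baseT_Full (1u << 5)
#define SUPPORTED_Autoneg        (1u << 6)

/* Clear the gigabit modes and leave every other bit untouched.
 * Bits are only ever cleared here, never set. */
uint32_t prune_gigabit(uint32_t supported)
{
	return supported &
	       ~(SUPPORTED_1000baseT_Half | SUPPORTED_1000baseT_Full);
}
```

A 10/100 controller would apply the same pruning to both the supported and
the advertising masks so that gigabit is never offered to the link partner.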

Lastly, once the controller is ready to handle network traffic, you call
phy_start(phydev). This tells the PAL that you are ready, and configures the
PHY to connect to the network. If the MAC interrupt of your network driver
also handles PHY status changes, just set phydev->irq to PHY_IGNORE_INTERRUPT
before you call phy_start and use phy_mac_interrupt() from the network
driver. If you don't want to use interrupts, set phydev->irq to PHY_POLL.
phy_start() enables the PHY interrupts (if applicable) and starts the
phylib state machine.

When you want to disconnect from the network (even if just briefly), you call
phy_stop(phydev). This function also stops the phylib state machine and
disables PHY interrupts.
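
Put together, the option 1 flow might look roughly like the following driver
fragment. This is a sketch under assumptions, not code taken from any driver:
the ``example_`` function names, the "0:00" PHY name and the choice of
PHY_INTERFACE_MODE_RGMII_ID are all hypothetical, and a real driver would
program its MAC registers where the comment indicates. It builds only as part
of a kernel module.

```c
#include <linux/err.h>
#include <linux/netdevice.h>
#include <linux/phy.h>

/* Called by the PAL whenever the link state changes. */
static void example_adjust_link(struct net_device *dev)
{
	struct phy_device *phydev = dev->phydev;

	if (phydev->link) {
		/* Hypothetical: reprogram the MAC here to match
		 * phydev->speed and phydev->duplex. */
	}
	phy_print_status(phydev);
}

/* Typically called from the driver's ndo_open handler. */
static int example_open(struct net_device *dev)
{
	struct phy_device *phydev;

	phydev = phy_connect(dev, "0:00", example_adjust_link,
			     PHY_INTERFACE_MODE_RGMII_ID);
	if (IS_ERR(phydev))
		return PTR_ERR(phydev);

	/* The controller is ready for traffic: start the state machine. */
	phy_start(phydev);
	return 0;
}

/* Typically called from the driver's ndo_stop handler. */
static int example_stop(struct net_device *dev)
{
	phy_stop(dev->phydev);
	phy_disconnect(dev->phydev);
	return 0;
}
```

Any pruning of phydev->supported and phydev->advertising would go between
phy_connect() and phy_start().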

PHY interface modes
===================

The PHY interface mode supplied in the phy_connect() family of functions
defines the initial operating mode of the PHY interface. This is not
guaranteed to remain constant; there are PHYs which dynamically change
their interface mode without software interaction depending on the
negotiation results.

Some of the interface modes are described below:

``PHY_INTERFACE_MODE_1000BASEX``
    This defines the 1000BASE-X single-lane serdes link as defined by the
    802.3 standard section 36. The link operates at a fixed bit rate of
    1.25Gbaud using an 8B/10B encoding scheme, resulting in an underlying
    data rate of 1Gbps. Embedded in the data stream is a 16-bit control
    word which is used to negotiate the duplex and pause modes with the
    remote end. This does not include "up-clocked" variants such as 2.5Gbps
    speeds (see below).

``PHY_INTERFACE_MODE_2500BASEX``
    This defines a variant of 1000BASE-X which is clocked 2.5 times faster
    than the 802.3 standard, giving a fixed bit rate of 3.125Gbaud.

``PHY_INTERFACE_MODE_SGMII``
    This is used for Cisco SGMII, which is a modification of 1000BASE-X
    as defined by the 802.3 standard. The SGMII link consists of a single
    serdes lane running at a fixed bit rate of 1.25Gbaud with 8B/10B
    encoding. The underlying data rate is 1Gbps, with the slower speeds of
    100Mbps and 10Mbps being achieved through replication of each data symbol.
    The 802.3 control word is re-purposed to send the negotiated speed and
    duplex information from the PHY to the MAC, and for the MAC to acknowledge
    receipt. This does not include "up-clocked" variants such as 2.5Gbps
    speeds.

    Note: mismatched SGMII vs 1000BASE-X configuration on a link can
    successfully pass data in some circumstances, but the 16-bit control
    word will not be correctly interpreted, which may cause mismatches in
    duplex, pause or other settings. This is dependent on the MAC and/or
    PHY behaviour.
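
The rates quoted above follow from the line coding: every 8 data bits travel
as 10 line bits. The sketch below works through that arithmetic, together
with the SGMII replication factor for the slower speeds; both function names
are hypothetical.

```c
/* Payload rate in Mbit/s of a lane that carries 8 data bits in every
 * 10 line bits, given the line rate in Mbaud. */
unsigned int payload_mbps(unsigned int line_mbaud)
{
	return line_mbaud * 8 / 10;
}

/* How many times SGMII repeats each symbol so that a 10 or 100 Mbit/s
 * stream fills the fixed 1 Gbit/s lane. Returns 0 for other speeds. */
unsigned int sgmii_replication(unsigned int speed_mbps)
{
	switch (speed_mbps) {
	case 10:
	case 100:
	case 1000:
		return 1000 / speed_mbps;
	default:
		return 0;
	}
}
```

With these definitions a 1.25Gbaud lane yields 1000 Mbit/s and a 3.125Gbaud
lane yields 2500 Mbit/s, while 100Mbps and 10Mbps SGMII repeat each symbol 10
and 100 times respectively.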


Pause frames / flow control
===========================

The PHY does not participate directly in flow control/pause frames except by
making sure that the SUPPORTED_Pause and SUPPORTED_Asym_Pause bits are set in
MII_ADVERTISE to indicate towards the link partner that the Ethernet MAC
controller supports such a thing. Since flow control/pause frames generation
involves the Ethernet MAC driver, it is recommended that this driver takes care
of properly indicating advertisement and support for such features by setting
the SUPPORTED_Pause and SUPPORTED_Asym_Pause bits accordingly. This can be done
either before or after phy_connect() and/or as a result of implementing the
ethtool::set_pauseparam feature.
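
How a MAC's receive and transmit pause capabilities translate into the two
advertisement bits is easy to get backwards, so here is a user-space sketch.
It assumes the conventional mapping (symmetric pause bit follows rx,
asymmetric bit is set when rx and tx differ) and uses local copies of the
MII_ADVERTISE bit values; ``pause_advertisement`` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pause bits of the MII_ADVERTISE register, copied locally. */
#define ADVERTISE_PAUSE_CAP  0x0400
#define ADVERTISE_PAUSE_ASYM 0x0800

/* rx: the MAC can honour received pause frames.
 * tx: the MAC can generate pause frames. */
uint16_t pause_advertisement(bool rx, bool tx)
{
	uint16_t adv = 0;

	if (rx)
		adv |= ADVERTISE_PAUSE_CAP;
	if (rx != tx)
		adv |= ADVERTISE_PAUSE_ASYM;
	return adv;
}
```

A MAC that can both send and honour pause frames therefore advertises only
the symmetric bit, while a transmit-only MAC advertises only the asymmetric
bit.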


Keeping Close Tabs on the PAL
=============================

It is possible that the PAL's built-in state machine needs a little help to
keep your network device and the PHY properly in sync. If so, you can
register a helper function when connecting to the PHY, which will be called
every second before the state machine reacts to any changes. To do this, you
need to manually call phy_attach() and phy_prepare_link(), and then call
phy_start_machine() with the second argument set to point to your special
handler.

Currently there are no examples of how to use this functionality, and testing
on it has been limited because the author does not have any drivers which use
it (they all use option 1). So Caveat Emptor.

Doing it all yourself
=====================

There's a remote chance that the PAL's built-in state machine cannot track
the complex interactions between the PHY and your network device. If this is
so, you can simply call phy_attach(), and not call phy_start_machine or
phy_prepare_link(). This will mean that phydev->state is entirely yours to
handle (phy_start and phy_stop toggle between some of the states, so you
might need to avoid them).

An effort has been made to make sure that useful functionality can be
accessed without the state-machine running, and most of these functions are
descended from functions which did not interact with a complex state-machine.
However, again, no effort has been made so far to test running without the
state machine, so tryer beware.

Here is a brief rundown of the functions::

    int phy_read(struct phy_device *phydev, u16 regnum);
    int phy_write(struct phy_device *phydev, u16 regnum, u16 val);

Simple read/write primitives. They invoke the bus's read/write function
pointers.
::

    void phy_print_status(struct phy_device *phydev);

A convenience function to print out the PHY status neatly.
::

    void phy_request_interrupt(struct phy_device *phydev);

Requests the IRQ for the PHY interrupts.
::

    struct phy_device * phy_attach(struct net_device *dev, const char *phy_id,
                                   phy_interface_t interface);

Attaches a network device to a particular PHY, binding the PHY to a generic
driver if none was found during bus initialization.
::

    int phy_start_aneg(struct phy_device *phydev);

Using variables inside the phydev structure, either configures advertising
and resets autonegotiation, or disables autonegotiation, and configures
forced settings.
::

    static inline int phy_read_status(struct phy_device *phydev);

Fills the phydev structure with up-to-date information about the current
settings in the PHY.
::

    int phy_ethtool_ksettings_set(struct phy_device *phydev,
                                  const struct ethtool_link_ksettings *cmd);

Ethtool convenience functions.
::

    int phy_mii_ioctl(struct phy_device *phydev,
                      struct mii_ioctl_data *mii_data, int cmd);

The MII ioctl. Note that this function will completely screw up the state
machine if you write registers like BMCR, BMSR, ADVERTISE, etc. Best to
use this only to write registers which are not standard, and don't set off
a renegotiation.

PHY Device Drivers
==================

With the PHY Abstraction Layer, adding support for new PHYs is
quite easy. In some cases, no work is required at all! However,
many PHYs require a little hand-holding to get up-and-running.

Generic PHY driver
------------------

If the desired PHY doesn't have any errata, quirks, or special
features you want to support, then it may be best to not add
support, and let the PHY Abstraction Layer's Generic PHY Driver
do all of the work.

Writing a PHY driver
--------------------

If you do need to write a PHY driver, the first thing to do is
make sure it can be matched with an appropriate PHY device.
This is done during bus initialization by reading the device's
UID (stored in registers 2 and 3), then comparing it to each
driver's phy_id field by ANDing it with each driver's
phy_id_mask field. Also, it needs a name. Here's an example::

    static struct phy_driver dm9161_driver = {
        .phy_id         = 0x0181b880,
        .name           = "Davicom DM9161E",
        .phy_id_mask    = 0x0ffffff0,
        ...
    };
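
The matching rule can be written out in a few lines of user-space C. This is
a sketch of the comparison described above, not the PAL's own code;
``phy_uid_from_regs`` and ``phy_id_matches`` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Combine registers 2 and 3 into the 32-bit UID: register 2 supplies
 * the upper 16 bits, register 3 the lower 16 bits. */
uint32_t phy_uid_from_regs(uint16_t reg2, uint16_t reg3)
{
	return ((uint32_t)reg2 << 16) | reg3;
}

/* A driver matches when the device UID and the driver's phy_id agree
 * on every bit that is set in the driver's phy_id_mask. */
bool phy_id_matches(uint32_t dev_uid, uint32_t drv_id, uint32_t drv_mask)
{
	return (dev_uid & drv_mask) == (drv_id & drv_mask);
}
```

With the DM9161E values shown above, a device reading 0x0181 and 0xb881 from
registers 2 and 3 matches, because the mask ignores the low four bits, which
commonly carry the silicon revision.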

Next, you need to specify what features (speed, duplex, autoneg,
etc) your PHY device and driver support. Most PHYs support
PHY_BASIC_FEATURES, but you can look in include/linux/mii.h for other
features.

Each driver consists of a number of function pointers, documented
in include/linux/phy.h under the phy_driver structure.

Of these, only config_aneg and read_status are required to be
assigned by the driver code. The rest are optional. Also, it is
preferred to use the generic phy driver's versions of these two
functions if at all possible: genphy_read_status and
genphy_config_aneg. If this is not possible, it is likely that
you only need to perform some actions before and after invoking
these functions, and so your functions will wrap the generic
ones.

Feel free to look at the Marvell, Cicada, and Davicom drivers in
drivers/net/phy/ for examples (the lxt and qsemi drivers have
not been tested as of this writing).

The PHY's MMD register accesses are handled by the PAL framework
by default, but can be overridden by a specific PHY driver if
required. This could be the case if a PHY was released for
manufacturing before the MMD PHY register definitions were
standardized by the IEEE. Most modern PHYs will be able to use
the generic PAL framework for accessing the PHY's MMD registers.
An example of such usage is for Energy Efficient Ethernet support,
implemented in the PAL. This support uses the PAL to access MMD
registers for EEE query and configuration if the PHY supports
the IEEE standard access mechanisms, or can use the PHY's specific
access interfaces if overridden by the specific PHY driver. See
the Micrel driver in drivers/net/phy/ for an example of how this
can be implemented.

Board Fixups
============

Sometimes the specific interaction between the platform and the PHY requires
special handling. For instance, to change where the PHY's clock input is,
or to add a delay to account for latency issues in the data path. In order
to support such contingencies, the PHY Layer allows platform code to register
fixups to be run when the PHY is brought up (or subsequently reset).

When the PHY Layer brings up a PHY it checks to see if there are any fixups
registered for it, matching based on UID (contained in the PHY device's phy_id
field) and the bus identifier (contained in phydev->dev.bus_id). Both must
match; however, two constants, PHY_ANY_ID and PHY_ANY_UID, are provided as
wildcards for the bus ID and UID, respectively.

When a match is found, the PHY layer will invoke the run function associated
with the fixup. This function is passed a pointer to the phy_device of
interest. It should therefore only operate on that PHY.

The platform code can either register the fixup using phy_register_fixup()::

    int phy_register_fixup(const char *phy_id,
                           u32 phy_uid, u32 phy_uid_mask,
                           int (*run)(struct phy_device *));

Or using one of the two stubs, phy_register_fixup_for_uid() and
phy_register_fixup_for_id()::

    int phy_register_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask,
                                   int (*run)(struct phy_device *));
    int phy_register_fixup_for_id(const char *phy_id,
                                  int (*run)(struct phy_device *));

The stubs set one of the two matching criteria, and set the other one to
match anything.
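
The two-part match with wildcards can be modelled in user space as follows.
This is a sketch of the rule as described above, not the PHY layer's
implementation: ``struct ex_fixup``, ``ex_fixup_applies`` and the ``EX_ANY_``
constants are hypothetical stand-ins, and the wildcard values chosen here are
placeholders for PHY_ANY_ID and PHY_ANY_UID.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Placeholder wildcard values standing in for PHY_ANY_ID / PHY_ANY_UID. */
#define EX_ANY_ID  "MATCH ANY PHY"
#define EX_ANY_UID 0xffffffffu

/* The matching criteria stored with a registered fixup. */
struct ex_fixup {
	const char *bus_id;  /* bus identifier, or EX_ANY_ID */
	uint32_t uid;        /* PHY UID, or EX_ANY_UID */
	uint32_t uid_mask;   /* bits of the UID that must agree */
};

/* Both the bus identifier and the UID must match, unless the
 * corresponding field of the fixup holds the wildcard. */
bool ex_fixup_applies(const struct ex_fixup *f, const char *bus_id,
		      uint32_t phy_id)
{
	bool bus_ok = strcmp(f->bus_id, EX_ANY_ID) == 0 ||
		      strcmp(f->bus_id, bus_id) == 0;
	bool uid_ok = f->uid == EX_ANY_UID ||
		      (f->uid & f->uid_mask) == (phy_id & f->uid_mask);

	return bus_ok && uid_ok;
}
```

In this model the ``_for_uid`` stub corresponds to a fixup whose bus_id is the
wildcard, and the ``_for_id`` stub to one whose uid is the wildcard.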

When phy_register_fixup() or \*_for_uid()/\*_for_id() is called from a module,
the fixup must be unregistered and its allocated memory freed before the
module is unloaded.

Call one of the following functions before unloading the module::

    int phy_unregister_fixup(const char *phy_id, u32 phy_uid, u32 phy_uid_mask);
    int phy_unregister_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask);
    int phy_unregister_fixup_for_id(const char *phy_id);

Standards
=========

IEEE Standard 802.3: CSMA/CD Access Method and Physical Layer Specifications, Section Two:
http://standards.ieee.org/getieee802/download/802.3-2008_section2.pdf

RGMII v1.3:
http://web.archive.org/web/20160303212629/http://www.hp.com/rnd/pdfs/RGMIIv1_3.pdf

RGMII v2.0:
http://web.archive.org/web/20160303171328/http://www.hp.com/rnd/pdfs/RGMIIv2_0_final_hp.pdf